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OM nucleic - nucleic search, using sw model 
Run on:         October 15, 2004, 15:11:57 ; Search time 3717.85 Seconds
                (without alignments)
                10265.033 Million cell updates/sec

Title:          US-10-070-532-1
Perfect score:  1278
Sequence:       1 atggagccctcagccacccc..........tcaccacagtgctgccctga 1278



Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 
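
The scoring parameters above can be tied back to the per-hit statistics later in this listing. The following is a minimal sketch, not the search program's own code. It assumes a match scores +1 under IDENTITY_NUC, that each gap costs Gapop once plus Gapext per inserted or deleted residue, and that a mismatch costs 0.6. The 0.6 value is an assumption inferred from the reported figures for Results 1, 3 and 4; it is not stated anywhere in the listing. Mismatches against an ambiguous base (N) appear to be scored differently and are not modelled here.

```python
def approx_sw_score(matches, mismatches, indels, gaps,
                    match=1.0, mismatch_penalty=0.6,
                    gap_open=10.0, gap_ext=1.0):
    """Approximate an alignment score from the summary counts.

    mismatch_penalty=0.6 is inferred from this listing, not documented.
    gap_open is charged once per gap, gap_ext once per indel residue.
    """
    return (matches * match
            - mismatches * mismatch_penalty
            - gaps * gap_open
            - indels * gap_ext)


if __name__ == "__main__":
    # Counts copied from the statistics lines of Results 1, 3 and 4.
    print(round(approx_sw_score(752, 1, 0, 0), 1))    # Result 1
    print(round(approx_sw_score(909, 1, 179, 1), 1))  # Result 3
    print(round(approx_sw_score(694, 2, 0, 0), 1))    # Result 4
```

Under these assumptions the sketch reproduces the reported scores for the three hits that contain no ambiguous bases in the aligned region.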



Searched: 27513289 seqs, 14931090276 residues 

Total number of hits satisfying chosen parameters: 55026578 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 



Database :      EST:*
                1:  em_estba:*
                2:  em_esthum:*
                3:  em_estin:*
                4:  em_estmu:*
                5:  em_estov:*
                6:  em_estpl:*
                7:  em_estro:*
                8:  em_htc:*
                9:  gb_est1:*
                10:  gb_est2:*
                11:  gb_htc:*
                12:  gb_est3:*
                13:  gb_est4:*
                14:  gb_est5:*
                15:  em_estfun:*
                16:  em_estom:*
                17:  em_gss_hum:*
                18:  em_gss_inv:*
                19:  em_gss_pln:*
                20:  em_gss_vrt:*
                21:  em_gss_fun:*
                22:  em_gss_mam:*
                23:  em_gss_mus:*
                24:  em_gss_pro:*
                25:  em_gss_rod:*
                26:  em_gss_phg:*
                27:  em_gss_vrl:*
                28:  gb_gss1:*
                29:  gb_gss2:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES

                 %
Result         Query
  No.    Score  Match  Length  DB  ID        Description

    1   751.4   58.8     753  29  AY420885  AY420885 Homo sapi
c   2   732.8   57.3     886  13  BX433093  BX433093 BX433093
    3   719.4   56.3    1740  11  BC035686  BC035686 Homo sapi
c   4   692.8   54.2     790  14  CF147830  CF147830 AGENCOURT
c   5   676.4   52.9     899  13  BX433092  BX433092 BX433092
    6   662.8   51.9     750  29  AY420886  AY420886 Pan trogl
    7   578.6   45.3    3470  11  AK048781  AK048781 Mus muscu
    8   578.6   45.3    3729  11  AK038551  AK038551 Mus muscu
    9   575.4   45.0     726  29  AY420887  AY420887 Mus muscu
   10   567.6   44.4    3153  11  AK079572  AK079572 Mus muscu
   11     521   40.8    1790  11  BC035858  BC035858 Homo sapi
   12   470.4   36.8    1001   9  AL535838  AL535838 AL535838
   13   468.4   36.7     520  13  BQ269289  BQ269289 ik23fl2.y
   14   437.4   34.2     892  13  BX409735  BX409735 BX409735
   15   393.2   30.8     993  12  BM926746  BM926746 AGENCOURT
c  16   386.8   30.3     625  13  BQ285933  BQ285933 ik23fl2.x
   17   376.2   29.4     543  13  BX119589  BX119589 BX119589
   18     367   28.7     788  14  CF147829  CF147829 AGENCOURT
c  19   336.2   26.3    1013   9  AL535837  AL535837 AL535837
   20   330.8   25.9     382  12  BQ042116  BQ042116 sheepl Sh
c  21     296   23.2     525  12  BI133700  BI133700 UI-M-BH3-
   22   285.4   22.3     635  12  BM939496  BM939496 UI-M-BH3-
   23   265.8   20.8          10  BB632359  BB632359 BB632359
   24   265.4   20.8     599  12  BM933820  BM933820 UI-M-BH3-
   25   263.2   20.6          13  BY723922  BY723922 BY723922
   26   216.8   17.0     477  12  BM087401  BM087401 500158 MA
   27     206   16.1     552  10  BE863072  BE863072 UI-M-BH0-
   28   202.4   15.8     662  10  BB632883  BB632883 BB632883
   29   199.8   15.6    1073  12  BM920548  BM920548 AGENCOURT
   30   198.4   15.5     245  12  BI976482  BI976482 485407 MA
   31           15.5          10  BB651179  BB651179 BB651179
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14. 


6 
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13 


BY239887 


BY239887 


BY239887 
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8 
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6 
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BX109847 


c 
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1 
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CE375359 
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tigr-gss- 




   36             .8    1290  29  AY411591  AY411591 Homo sapi
   37     162   12.7     721  29  CE235359  CE235359 tigr-gss-
   38   157.8   12.3    1296  29  AY411593  AY411593 Mus muscu
   39   134.6   10.5     257  10  AW427900  AW427900 64510 MAR
c  40   127.2   10.0    1005  28  CC212654  CC212654 CH261-75F
c  41   127.2   10.0    1058  28  CC297061  CC297061 CH261-177
c  42   125.4    9.8     564  13  BU680891  BU680891 UI-CF-EC1
c  43   122.4    9.6    1194  28  CC279941  CC279941 CH261-24C
   44            9.4          29  CG978334  CG978334 CH240 169
   45            9.4          10


ALIGNMENTS



RESULT 1 
AY420885
LOCUS       AY420885                 753 bp    DNA     linear   GSS 17-DEC-2003
DEFINITION  Homo sapiens HCRTR1 gene, VIRTUAL TRANSCRIPT, partial sequence,
            genomic survey sequence.
ACCESSION   AY420885
VERSION     AY420885.1  GI:39776842
KEYWORDS    GSS.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 753)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Inferring nonneutral evolution from human-chimp-mouse orthologous
            gene trios
  JOURNAL   Science 302 (5652), 1960-1963 (2003)
   PUBMED   14671302
REFERENCE   2  (bases 1 to 753)
  AUTHORS   Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
            Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
            Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
            Adams,M.D. and Cargill,M.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-NOV-2003) Celera Genomics, 45 West Gude Drive,
            Rockville, MD 20850, USA
COMMENT     This sequence was made by sequencing genomic exons and ordering them
            based on alignment.
FEATURES             Location/Qualifiers
     source          1..753
                     /organism="Homo sapiens"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:9606"
     gene            <1..>753
                     /gene="HCRTR1"
                     /locus_tag="HCM7373"
ORIGIN



Query Match 58.8%; Score 751.4; DB 29; Length 753;
Best Local Similarity 99.9%; Pred. No. 4.5e-140;
Matches 752; Conservative 0; Mismatches 1; Indels 0; Gaps 0;
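
The two percentages in the statistics block can be reproduced from the other numbers in this listing. The sketch below rests on two assumptions that are inferred from the reported values rather than taken from any program documentation: Query Match is the hit score as a percentage of the perfect score (1278 for this query), and Best Local Similarity is the number of matches as a percentage of all alignment columns (matches + mismatches + indels). The function names are illustrative only.

```python
PERFECT_SCORE = 1278  # "Perfect score" reported in the header of this listing


def query_match_pct(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect score, to one decimal place."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches, mismatches, indels):
    """Matches as a percentage of all aligned columns, to one decimal place."""
    columns = matches + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Result 1: Score 751.4; Matches 752; Mismatches 1; Indels 0
    print(query_match_pct(751.4), best_local_similarity_pct(752, 1, 0))
    # Result 3: Score 719.4; Matches 909; Mismatches 1; Indels 179
    print(query_match_pct(719.4), best_local_similarity_pct(909, 1, 179))
```

Both calls give the percentages printed for Results 1 and 3, which is what suggests the two definitions assumed here.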



Qy     526 ATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGC 585
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1 ATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGC 60

Qy     586 ACACGGCTCTTCTCAGTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTAC 645
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      61 ACACGGCTCTTCTCAGTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTAC 120

Qy     646 CACAGTTGCTTCTTTATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTAT 705
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     121 CACAGTTGCTTCTTTATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTAT 180

Qy     706 TTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTG 765
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     181 TTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTG 240

Qy     766 CGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAG 825
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     241 CGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAG 300

Qy     826 CCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAG 885
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     301 CCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAG 360

Qy     886 ACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGC 945
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     361 ACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGC 420

Qy     946 GTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCT 1005
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     421 GTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCT 480

Qy    1006 GTCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCC 1065
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     481 GTCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCC 540

Qy    1066 ATCATCTACAACTTCCTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGC 1125
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     541 ATCATCTACAACTTCCTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGC 600

Qy    1126 TGCCTGCCTGGCCTGGGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCC 1185
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     601 TGCCTGCCTGGCCTGGGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCC 660

Qy    1186 AGCCACAAGTCCTTGTCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTG 1245
           |||||||||||||||||||||||||||||||||||| |||||||||||||||||||||||
Db     661 AGCCACAAGTCCTTGTCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTG 720

Qy    1246 GTGCTCACCAGCGTCACCACAGTGCTGCCCTGA 1278
           |||||||||||||||||||||||||||||||||
Db     721 GTGCTCACCAGCGTCACCACAGTGCTGCCCTGA 753



RESULT 2 

BX433093/C
LOCUS       BX433093                 886 bp    mRNA    linear   EST 15-MAY-2003
DEFINITION  BX433093 Homo sapiens FETAL BRAIN Homo sapiens cDNA clone
            CS0DF013YE04 3-PRIME, mRNA sequence.
ACCESSION   BX433093
VERSION     BX433093.1  GI:30779168
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 886)
  AUTHORS   Li,W.B., Gruber,C., Jessee,J. and Polayes,D.
  TITLE     Full-length cDNA libraries and normalization
  JOURNAL   Unpublished (2001)
COMMENT     Contact: Genoscope
            Genoscope - Centre National de Sequencage
            BP 191 91006 EVRY cedex - France
            Email: seqref@genoscope.cns.fr, Web : www.genoscope.cns.fr
            Library was constructed by Life Technologies, a division of
            Invitrogen. This sequence belongs to sequence cluster 151.r For
            more information about this cluster, see
            http://www.genoscope.cns.fr/
            cgi-bin/cluster.cgi?seq=CS0BAI011ZB01_CS00962_2&cluster=151.r.
            Contact : Feng Liang Email : fliang@lifetech.com URL :
            http://fulllength.invitrogen.com/ InVitroGen Corporation 1600
            Faraday Avenue Genoscope sequence ID : CS0BAI011ZB01_CS00962_2.
FEATURES             Location/Qualifiers
     source          1..886
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="CS0DF013YE04"
                     /tissue_type="FETAL BRAIN"
                     /dev_stage="fetal"
                     /clone_lib="Homo sapiens FETAL BRAIN"
                     /note="Organ: brain; Vector: pCMVSPORT_6; 1st strand cDNA
                     was primed with a NotI-oligo (dT) primer. Five prime end
                     enriched, double-strand cDNA was digested with Not I and
                     cloned into the Not I and EcoRV sites of the pCMVSPORT 6
                     vector. Library was not normalized."
ORIGIN



Query Match 57.3%; Score 732.8; DB 13; Length 886;
Best Local Similarity 98.9%; Pred. No. 2.6e-136;
Matches 737; Conservative 0; Mismatches 8; Indels 0; Gaps 0;



Qy     377 AGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGT 436
           |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db     745 AGGCTGTGTCCGTGTCAGTGACAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGT 686

Qy     437 ATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCC 496
           |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db     685 ATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATNC 626

Qy     497 TGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCA 556
           ||||||||||| ||||||||||||||||||||||||||  ||||||||||||| ||||||
Db     625 TGGGCATCTGGCCTGTGTCGCTGGCCATCATGGTGCCCAGGGCTGCAGTCATGCAATGCA 566

Qy     557 GCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCT 616
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     565 GCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCT 506

Qy     617 GGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGG 676
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     505 GGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGG 446

Qy     677 CCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCC 736
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     445 CCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCC 386

Qy     737 AGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGG 796
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     385 AGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGG 326

Qy     797 GGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTG 856
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     325 GGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTG 266

Qy     857 AAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGG 916
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     265 AAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGG 206

Qy     917 TCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGA 976
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     205 TCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGA 146

Qy     977 TGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGC 1036
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     145 TGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGC 86

Qy    1037 TGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCAAATTCC 1096
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      85 TGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCAAATTCC 26

Qy    1097 GGGAGCAGTTTAAGGCTGCCTTCTC 1121
           ||||||||||||||||  |||||||
Db      25 GGGAGCAGTTTAAGGCATCCTTCTC 1
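
In Result 2 the Db coordinates run downward (745 to 1) while the Qy coordinates run upward. Together with the "/C" suffix on the hit name and the leading "c" in the summary table, this suggests the query was aligned to the reverse complement of the database entry. That reading is an assumption; the listing itself does not define the flag. Under that assumption, the Db line shown in the alignment is the reverse complement of the stored sequence, which can be sketched as follows (function names are illustrative):

```python
# Complement table for unambiguous bases plus N, in both cases.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_complement_hit(db_start, db_end):
    """A hit whose Db coordinates descend lies on the complementary strand."""
    return db_start > db_end


if __name__ == "__main__":
    print(reverse_complement("ATGC"))   # GCAT
    print(is_complement_hit(745, 686))  # True for the first block of Result 2
```

To compare a "/C" alignment against the stored record, one would apply `reverse_complement` to the Db line and read the coordinates in ascending order.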



RESULT 3 
BC035686
LOCUS       BC035686                1740 bp    mRNA    linear   HTC 20-SEP-2002
DEFINITION  Homo sapiens, Similar to hypocretin (orexin) receptor 1, clone
            IMAGE:5750551, mRNA.
ACCESSION   BC035686
VERSION     BC035686.1  GI:23242909
KEYWORDS    HTC.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1740)
  AUTHORS   Strausberg,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (31-JUL-2002) National Institutes of Health, Mammalian
            Gene Collection (MGC), Cancer Genomics Office, National Cancer
            Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590,
            USA
  REMARK    NIH-MGC Project URL: http://mgc.nci.nih.gov
COMMENT     Contact: MGC help desk
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Life Technologies, Inc.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: National Institutes of Health Intramural
            Sequencing Center (NISC),
            Gaithersburg, Maryland;
            Web site: http://www.nisc.nih.gov/
            Contact: nisc_mgc@nhgri.nih.gov
            Akhter,N., Ayele,K., Beckstrom-Sternberg,S.M., Benjamin,B.,
            Blakesley,R.W., Bouffard,G.G., Breen,K., Brinkley,C., Brooks,S.,
            Dietrich,N.L., Granite,S., Guan,X., Gupta,J., Haghighi,P.,
            Hansen,N., Ho,S.-L., Karlins,E., Kwong,P., Laric,P., Legaspi,R.,
            Maduro,Q.L., Masiello,C., Maskeri,B., Mastrian,S.D., McCloskey,J.C.,
            McDowell,J., Pearson,R., Stantripop,S., Thomas,P.J., Touchman,J.W.,
            Tsurgeon,C., Vogt,J.L., Walker,M.A., Wetherby,K.D., Wiggins,L.,
            Young,A., Zhang,L.-H. and Green,E.D.
            Clone distribution: MGC clone distribution information can be found
            through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
            Series: IRAK Plate: 79 Row: m Column: 17
            This clone was selected for full length sequencing because it
            passed the following selection criteria: matched mRNA gi: 4557636
            This clone has the following problem: frame shifted.
FEATURES             Location/Qualifiers
     source          1..1740
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5750551"
                     /tissue_type="Lung, Spleen, fetal, pooled"
                     /clone_lib="NIH_MGC_122"
                     /lab_host="DH10B"
                     /note="Vector: pCMV-SPORT6"
ORIGIN

Query Match 56.3%; Score 719.4; DB 11; Length 1740;
Best Local Similarity 83.5%; Pred. No. 1.7e-133;
Matches 909; Conservative 0; Mismatches 1; Indels 179; Gaps 1;

ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 



I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I M II I 



TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 685 

CT GGT GGGCAAC AC G CTGGT CTGCCT GGC CGT GT GG CGGAAC CAC CACAT GAGGACAGT C 24 0 
I I I I I I I I I I I I I I I I II 

CTGGTGGGCAACACGCTG 703 

241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 



Qy 


i 


Db 


506 


Qy 


61 


Db 


566 


Qy 


. 121 


Db 


62 6 


Qy 


181 


Db 


686 


Qy 


241 



Db 



704 



703 



301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 
704 703 



361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 

704 GGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 746 

421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 4 80 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

747 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 806 

481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

807 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 8 66 

541 GCAGT CAT GGAAT G CAG CAGT GT GCT GCCT GAGCTAGC CAAC CGCACAC GGCT CT TCT CA 600 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I 
867 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 926 

601 GTCT GT GAT GAACGCT GGGCAGAT GAC CT CT AT CC CAAGAT CT AC CAC AGTT GCT T CTT T 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
927 GT CT GT GAT GAACGCT GG GCAGAT G AC CT CT AT C C CAAGAT CT ACCAC AGT T GC T T CT T T 986 

661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I 

987 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 1046 

721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1047 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 1106 

781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1107 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 1166 

841 C GCGCCT T CCT GGCT GAAGT GAAG CAGAT GC GT GCAC GGAGGAAGACAGCCAAGAT GCT G 900 
I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1167 C GC GCCT T C CT GGCT GAAGT GAAGCAGAT G C GT GCAC GGAGGAAGAC AGC CAAGATGCT G 1226 

901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
1227 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 128 6 

961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I'M I I I I I I I I I I I I I I II II I I I I I I I 

1287 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGC'GAAGCTGTCTACGCCTGCTTC 1346 

1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1347 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1406 

1081 CTCAGTGGC 1089 

I I I I I I I I I 
1407 CTCAGTGGC 1415 



RESULT 4 

CF147830/C
LOCUS       CF147830                 790 bp    mRNA    linear   EST 25-JUL-2003
DEFINITION  AGENCOURT_14740202 NIH_MGC_145 Homo sapiens cDNA clone
            IMAGE:6971889 5', mRNA sequence.
ACCESSION   CF147830
VERSION     CF147830.1  GI:33244098
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 790)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Daniela S. Gerhard, Ph.D.
            Office of Cancer Genomics
            National Cancer Institute / NIH
            Bldg. 31 Rm10A07 Bethesda, MD 20892
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: GPCR Consortium
            cDNA Library Preparation: GPCR Consortium
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: IRBI02 row: a column: 08
            High quality sequence start: 7
            High quality sequence stop: 738.
FEATURES             Location/Qualifiers
     source          1..790
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:6971889"
                     /tissue_type="mixed"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_145"
                     /note="Vector: pcDNA3.1; Site_1: varies by clone; Site_2:
                     varies by clone; ORFs were PCR-amplified and cloned into
                     pcDNA3.1 by the GPCR Consortium. Cloning sites vary by
                     clone and include the following: 5'-EcoRV-XmnI/XhoI-3',
                     5'-EcoRV-XmnI/NotI-3', EcoRV (TA cloned, non-directional).
                     For information, about which gene each clones represents,
                     please visit our anonymous ftp site at
                     ftp://image.llnl.gov/image/rearrayed_plates/IRBI.preSV.dat
                     a Note: this is a NIH_MGC Library."
ORIGIN



Query Match 54.2%; Score 692.8; DB 14; Length 790; 

Best Local Similarity 99.7%; Pred. No. 2.5e-128; 

Matches 694; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 
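
Each hit's statistics block is a run of "name value;" pairs, so it can be read mechanically. The following is a minimal sketch of one way to do that; the function name and the returned dictionary layout are illustrative choices, not part of the search program's output format.

```python
import re

# A field is a label (letters, dots, spaces) followed by a numeric value,
# optionally in scientific notation and optionally ending in '%'.
_FIELD = re.compile(r"\s*([A-Za-z. ]+?)\s+([-+0-9.eE%]+)\s*$")


def parse_stats(block):
    """Parse a statistics block into a {label: float} dictionary."""
    stats = {}
    for part in block.replace("\n", " ").split(";"):
        m = _FIELD.match(part)
        if m:
            label = m.group(1).strip()
            stats[label] = float(m.group(2).rstrip("%"))
    return stats


if __name__ == "__main__":
    block = (
        "Query Match 54.2%; Score 692.8; DB 14; Length 790;\n"
        "Best Local Similarity 99.7%; Pred. No. 2.5e-128;\n"
        "Matches 694; Conservative 0; Mismatches 2; Indels 0; Gaps 0;"
    )
    for label, value in parse_stats(block).items():
        print(label, value)
```

All values come back as floats, so integer fields such as Matches and Length need an explicit `int(...)` if exact integer types matter.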



Qy     583 CGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATC 642
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     790 CGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATC 731

Qy     643 TACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCC 702
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     730 TACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCC 671

Qy     703 TATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTG 762
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     670 TATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTG 611

Qy     763 GTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGA 822
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     610 GTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGA 551

Qy     823 GAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGG 882
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     550 GAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGG 491

Qy     883 AAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATC 942
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     490 AAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATC 431

Qy     943 AGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAA 1002
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     430 AGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAA 371

Qy    1003 GCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAAC 1062
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     370 GCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAAC 311

Qy    1063 CCCATCATCTACAACTTCCTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCC 1122
           ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     310 CCCATCATCTACAACTTCCTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCC 251

Qy    1123 TGCTGCCTGCCTGGCCTGGGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCT 1182
           |||||||||||||||||||| |||||||||||||||||||||||||||||||||||||||
Db     250 TGCTGCCTGCCTGGCCTGGGCCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCT 191

Qy    1183 GCCAGCCACAAGTCCTTGTCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCAT 1242
           ||||||||||||||||||||||||||||||||||||||| ||||||||||||||||||||
Db     190 GCCAGCCACAAGTCCTTGTCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCAT 131

Qy    1243 GTGGTGCTCACCAGCGTCACCACAGTGCTGCCCTGA 1278
           ||||||||||||||||||||||||||||||||||||
Db     130 GTGGTGCTCACCAGCGTCACCACAGTGCTGCCCTGA 95



RESULT 5 

BX433092/C
LOCUS       BX433092                 899 bp    mRNA    linear   EST 15-MAY-2003
DEFINITION  BX433092 Homo sapiens FETAL BRAIN Homo sapiens cDNA clone
            CS0DF013YE04 3-PRIME, mRNA sequence.
ACCESSION   BX433092
VERSION     BX433092.1  GI:30779167
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 899)
  AUTHORS   Li,W.B., Gruber,C., Jessee,J. and Polayes,D.
  TITLE     Full-length cDNA libraries and normalization
  JOURNAL   Unpublished (2001)
COMMENT     Contact: Genoscope
            Genoscope - Centre National de Sequencage
            BP 191 91006 EVRY cedex - France
            Email: seqref@genoscope.cns.fr, Web : www.genoscope.cns.fr
            Library was constructed by Life Technologies, a division of
            Invitrogen. This sequence belongs to sequence cluster 151.r For
            more information about this cluster, see
            http://www.genoscope.cns.fr/
            cgi-bin/cluster.cgi?seq=CS0BAI011ZB01_CS00962_1&cluster=151.r.
            Contact : Feng Liang Email : fliang@lifetech.com URL :
            http://fulllength.invitrogen.com/ InVitroGen Corporation 1600
            Faraday Avenue Genoscope sequence ID : CS0BAI011ZB01_CS00962_1.
FEATURES             Location/Qualifiers
     source          1..899
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="CS0DF013YE04"
                     /tissue_type="FETAL BRAIN"
                     /dev_stage="fetal"
                     /clone_lib="Homo sapiens FETAL BRAIN"
                     /note="Organ: brain; Vector: pCMVSPORT_6; 1st strand cDNA
                     was primed with a NotI-oligo (dT) primer. Five prime end
                     enriched, double-strand cDNA was digested with Not I and
                     cloned into the Not I and EcoRV sites of the pCMVSPORT 6
                     vector. Library was not normalized."
ORIGIN



Query Match 52.9%; Score 676.4; DB 13; Length 899;
Best Local Similarity 96.6%; Pred. No. 5.2e-125;
Matches 711; Conservative 0; Mismatches 23; Indels 2; Gaps 2;



Qy 



Db 



372 TCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCG 431 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

738 TGTAAGNNCTGTGTCGTGTTCAGTGGCAGTGCTACTTCTCAGCTTCATCGCCTGGACCCG 67 9 



Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 



,432 CTGGTATGCCATC-TGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCT 490 
I I I I I I I I II I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I 

678 CTGGTATGCCATCATCCCACCCACTATTGTCAAAGAGCACAGCCCGGCGGGCCCGTGCTC .619 

491 CCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGG 550 

I I I I I I 1.1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

618 CCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGG 559 

551 AATGCAGCAGTGTGCTGCCTGAGCTAGCC7UVCCGCACACGGCTCTTCTCAGTCTGTGATG 610 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

558 AAT GCAGCAGT GT GCT GCCT GAGCTAGCCAACCGCACACGGCT CTT CT CAGT CT GT GATG 499 

611 AACGCT GGGCAGAT GACCT CTAT CCCAAGAT CT ACCACAGTT GCTT CTTT ATT GT CACCT 670 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

498 AAC GCT GG GCAGAT GAC CT CTAT CC CAAGAT CT ACCACAGTT GCTT CTT TATT GT CAC CT 439 



Qy 

Db 



671 
438 



730 



379 



Qy 731 GCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACC 790

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 378 GCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACC 319

Qy 791 AGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCC 850 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I M I I I I I 
Db 318 AGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCC 259 

Qy 851 TGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGC 910

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I
Db 258 TGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGC 199

Qy 911 TGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGT 970

I | | | | | | | | | I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 198 TGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGT 139 

Qy 971 TCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCC 1030 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 138 TCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCC 79

Qy 1031 ACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCA 1090 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 78 ACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCT-CAACTTCCTCAGGTGCA 20



Qy 1091 AATTCCGGGAGCAGTT 1106 

I I I I I I I I I I I I I I I I 

Db 19 AATTCCGGGAGCAGTT 4 



RESULT 6 
AY420886 
LOCUS AY420886 750 bp DNA linear GSS 17-DEC-2003
DEFINITION Pan troglodytes HCRTR1 gene, VIRTUAL TRANSCRIPT, partial sequence,
genomic survey sequence.
ACCESSION AY420886
VERSION AY420886.1 GI:39776843
KEYWORDS GSS.
SOURCE Pan troglodytes (chimpanzee)
ORGANISM Pan troglodytes
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Pan.
REFERENCE 1 (bases 1 to 750)
AUTHORS Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
Adams,M.D. and Cargill,M.
TITLE Inferring nonneutral evolution from human-chimp-mouse orthologous
gene trios
JOURNAL Science 302 (5652), 1960-1963 (2003)
PUBMED 14671302
REFERENCE 2 (bases 1 to 750)
AUTHORS Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
Adams,M.D. and Cargill,M.
TITLE Direct Submission
JOURNAL Submitted (16-NOV-2003) Celera Genomics, 45 West Gude Drive,
Rockville, MD 20850, USA
COMMENT This sequence as made by sequencing genomic exons and ordering them
based on alignment.
FEATURES Location/Qualifiers
source 1..750
/organism="Pan troglodytes"
/mol_type="genomic DNA"
/db_xref="taxon:9598"
gene <1..>750
/gene="HCRTR1"
/locus_tag="HCM7373"

ORIGIN 

Query Match 51.9%; Score 662.8; DB 29; Length 750; 

Best Local Similarity 88.5%; Pred. No. 2.5e-122; 

Matches 664; Conservative 0; Mismatches 86; Indels 0; Gaps 0;

Qy 526 ATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGC 585 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 ATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGC 60 

Qy 586 ACACGGCTCTTCTCAGTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTAC 645

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 61 ACACGGCTCTTCTCAGTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTAC 120

Qy 646 CACAGTTGCTTCTTTATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTAT 705

I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I 

Db 121 CACAGTTGCTTCTTTATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTAT 180 

Qy 706 TTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTG 765

I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 

Db 181 TTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTG 240 

Qy 766 CGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAG 825

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 241 CGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAG 300

Qy 826 CCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAG 885 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 

Db 301 CCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCGCGGAGGAAG 360 

Qy 886 ACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGC 945

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 361 ACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGT 420 

Qy 946 GTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCT 1005 

I I II I I I I I I I I I I II I I I I 

Db 421 GTCCTCAATGTCCTTAAGAGNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN 480 

Qy 1006 GTCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCC 1065 

I II I I II I I I I I I I I I I I I 

Db 481 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNACGCCAACAGNNNNGCCAACCCC 540 



Qy 1066 ATCATCTACAACTTCCTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGC 1125

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 541 ATCANNNACAACTTCCTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGC 600



Qy 1126 TGCCTGCCTGGCCTGGGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCC 1185 

I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 TGCCTGCCTGGCCTGGGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCC 660 

Qy 1186 AGCCACAAGTCCTTGTCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTG 1245

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 661 AGCCACAAGTCCTTGTCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTG 720

Qy 1246 GTGCTCACCAGCGTCACCACAGTGCTGCCC 1275

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 721 GTGCTCACCAGCGTCACCACAGTGCTGCCC 750
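
The percentage fields printed with each result can be checked against the raw counts. The sketch below is an illustration only, not part of the search program's output. It assumes that Query Match is the hit score divided by a perfect score of 1278 (the query length under the IDENTITY_NUC table), and that Best Local Similarity is Matches divided by the sum of Matches, Mismatches and Indels. Both function names are hypothetical. Under those assumptions the figures printed for RESULT 5 and RESULT 6 are reproduced.

```python
# Hypothetical helpers; the formulas are assumptions inferred from the
# printed figures, not a documented definition from the search software.

PERFECT_SCORE = 1278.0  # assumed: query length scored with IDENTITY_NUC


def query_match_pct(score, perfect_score=PERFECT_SCORE):
    """Score of the hit as a percentage of the assumed perfect score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, mismatches, indels):
    """Matches as a percentage of all aligned columns."""
    aligned_columns = matches + mismatches + indels
    return 100.0 * matches / aligned_columns


# (score, matches, mismatches, indels) as printed for each result
results = {
    "RESULT 5": (676.4, 711, 23, 2),
    "RESULT 6": (662.8, 664, 86, 0),
}

for name, (score, matches, mismatches, indels) in results.items():
    qm = round(query_match_pct(score), 1)
    bls = round(best_local_similarity_pct(matches, mismatches, indels), 1)
    print(name, "Query Match", qm, "Best Local Similarity", bls)
```

The same check is a quick way to spot digits damaged in transcription: a Matches value that does not reproduce the printed Best Local Similarity is likely misread.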



RESULT 7 
AK048781 
LOCUS AK048781 3470 bp mRNA linear HTC 20-SEP-2003
DEFINITION Mus musculus 0 day neonate cerebellum cDNA, RIKEN full-length
enriched library, clone:C230065B06 product:OREXIN RECEPTOR TYPE 2,
full insert sequence.
ACCESSION AK048781
VERSION AK048781.1 GI:26339571
KEYWORDS HTC; CAP trapper.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1
AUTHORS Carninci,P. and Hayashizaki,Y.
TITLE High-efficiency full-length cDNA cloning
JOURNAL Meth. Enzymol. 303, 19-44 (1999)
MEDLINE 99279253
PUBMED 10349636
REFERENCE 2
AUTHORS Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
TITLE Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new genes
JOURNAL Genome Res. 10 (10), 1617-1630 (2000)
MEDLINE 20499374
PUBMED 11042159
REFERENCE 3
AUTHORS Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
TITLE RIKEN integrated sequence analysis (RISA) system -- 384-format
sequencing pipeline with 384 multicapillary sequencer
JOURNAL Genome Res. 10 (11), 1757-1771 (2000)
MEDLINE 20530913



PUBMED 11076861 
REFERENCE 4

AUTHORS The RIKEN Genome Exploration Research Group Phase II Team and the
FANTOM Consortium.

TITLE Functional annotation of a full-length mouse cDNA collection 

JOURNAL Nature 409, 685-690 (2001) 
REFERENCE 5 

AUTHORS The FANTOM Consortium and the RIKEN Genome Exploration Research 
Group Phase I & II Team. 

TITLE Analysis of the mouse transcriptome based on functional annotation 

of 60,770 full-length cDNAs 

JOURNAL Nature 420, 563-573 (2002) 
REFERENCE 6 (bases 1 to 3470) 

AUTHORS Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci,P.,
Fukuda,S., Furuno,M., Hanagaki,T., Hara,A., Hashizume,W.,
Hayashida,K., Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T.,
Hori,F., Imotani,K., Ishii,Y., Itoh,M., Kagawa,I., Kasukawa,T.,
Katoh,H., Kawai,J., Kojima,Y., Kondo,S., Konno,H., Kouda,M.,
Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Murata,M.,
Nakamura,M., Nishi,K., Nomura,K., Numazaki,R., Ohno,M., Ohsato,N.,
Okazaki,Y., Saito,R., Saitoh,H., Sakai,C., Sakai,K., Sakazume,N.,
Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Tagami,M., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.

TITLE Direct Submission 

JOURNAL Submitted (16-JUL-2001) Yoshihide Hayashizaki, The Institute of
Physical and Chemical Research (RIKEN), Laboratory for Genome
Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
Fax:81-45-503-9216)
COMMENT cDNA library was prepared and sequenced in Mouse Genome 

Encyclopedia Project of Genome Exploration Research Group in Riken 
Genomic Sciences Center and Genome Science Laboratory in RIKEN. 
Division of Experimental Animal Research in Riken contributed to 
prepare mouse tissues. 

Please visit our web site for further details. 
URL:http://genome.gsc.riken.go.jp/
URL:http://fantom.gsc.riken.go.jp/.
FEATURES Location/Qualifiers 
source 1. .3470 

/organism="Mus musculus" 

/mol_type="mRNA"

/strain="C57BL/6J"

/db_xref="FANTOM_DB:C230065B06"

/db_xref="MGI:2415851"

/db_xref="taxon: 10090" 

/clone="C230065B06" 

/tissue_type=" cerebellum" 

/clone_lib="RIKEN full-length enriched mouse cDNA library" 
/dev_stage="0 day neonate" 
CDS 75..1457

/note="unnamed protein product; OREXIN RECEPTOR TYPE 2
(SWISSPROT|P56719, evidence: FASTY, 98.5%ID, 100%length,
match=1380)
putative"
/codon_start=1
/protein_id="BAC33457.1"
/db_xref="GI:26339572"

/translation="MSSTKLEDSLSRRNWSSASELNETQEPFLNPTDYDDEEFLRYLW
REYLHPKEYEWVXIAGYIIVFWALIGNVXVCVAWKNHHMRTVTNYFIWLSIJ^
WITCLPATLWDITETWFFGQSLCKVIPYLQTVSVSVSVLTLSCIALDRWYAICHPL
MFKSTAKRARNSIWIWIVSCIIMIPQAIWECSSMLPGIANKTTLFTVCDEHWGGEV
YPKMYHICFFLVTYMAPLCLMILAYLQIFRKLWCRQIPGTSSWQRKWKQQQPVSQPR
GSGQQSKARISAVAAEIKQIRARRKTARMLMWLLVFAICYLPISILNVLKRVFGMFT
HTEDRETVYAWFTFSHWLVYANSAANPIIYNFLSGKFREEFKAAFSCCLGVHHRQGDR
LARGRTSTESRKSLTTQISNFDNVSKLSEHWLTSISTLPAANGAGPLQNWYLQQGVP
SSLLSTWLEV"

polyA_signal 3455..3460

/note="putative"

polyA_site 3470

/note="putative"

ORIGIN



Query Match 45.3%; Score 578.6; DB 11; Length 3470;

Best Local Similarity 69.0%; Pred. No. 3.7e-105;

Matches 826; Conservative 0; Mismatches 359; Indels 12; Gaps 2;



Qy 80 ATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTGTACCCAAAACAGTATGAGT 139
I II II II II II II II I I I II I I II II II I I I I III I I I I I I I I
Db 178 ACGACGAGGAATTCCTGCGGTACCTGTGGAGGGAATACCTACACCCGAAAGAATATGAGT 237

Qy 140 GGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCCCTGGTGGGCAACACGCTGG 199
I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 238 GGGTCCTGATCGCAGGGTATATCATCGTGTTCGTTGTGGCTCTCATCGGGAACGTCCTGG 297

Qy 200 TCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTCACCAACTACTTCATTGTCA 259
I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 298 TCTGTGTGGCAGTGTGGAAGAACCACCACATGAGGACAGTCACCAACTACTTCATAGTCA 357

Qy 260 ACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCTGCTGGTGG 319
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 358 ACCTTTCCCTAGCAGATGTGCTTGTGACCATCACCTGCCTTCCAGCTACCCTCGTTGTTG 417

Qy 320 ACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGG 379
I I I I I I I I I I I I III I II II II I I I I I I I MINIM II III I I I I I
Db 418 ACATCACTGAGACTTGGTTCTTTGGACAGTCCCTCTGTAAGGTCATTCCTTATTTACAGA 477

Qy 380 CTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGTATG 439
I I I I I I I I I I I I III I II II II I I I II III III I I I I I I I I I I I I I
Db 478 CTGTGTCAGTGTCTGTGTCTGTTCTTACGTTGAGCTGCATTGCCTTGGACCGATGGTACG 537

Qy 440 CCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCCTGG 499
I I I I II I I I I I I I I I I I I I I I I I I I I I I I Mill M I I I I I I I
Db 538 CCATTTGTCACCCTTTGATGTTCAAGAGCACAGCCAAACGGGCTCGAAACAGCATCGTTG 597

Qy 500 GCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCA 559
I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 598 TCATCTGGATCGTCTCCTGCATCATAATGATTCCTCAAGCCATTGTCATGGAGTGCAGCA 657

Qy 560 GTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGG 619
I I I I I I I I I M II I I II II I I I I I I I I I I I I I I I I I I I Mill
Db 658 GCATGCTCCCTGGCCTAGCCAATAAGACCACCCTCTTTACAGTATGTGATGAACACTGGG 717

Qy 620 CAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCC 679
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I
Db 718 GCGGTGAAGTTTACCCAAAGATGTACCATATCTGCTTCTTTCTGGTGACATACATGGCAC 777

Qy 680 CACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGA 739
I III III III I I I I III I I I I I I II I I I I II I I I I I I I I I I I I II
Db 778 CTCTGTGTCTTATGATATTGGCTTATCTCCAAATATTCCGTAAACTCTGGTGCCGACAGA 837

Qy 740 TCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGG 799
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 838 TTCCCGGAACTTCTTCTGTGGTTCAGAGAAAATGGAAGCAGCAGCAGCCGGTTT 891

Qy 800 ACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAG 859
II I III I II I II I I I I I I I I I Mill
Db 892 CTCAGCCCCGGGGGTCCGGACAGCAGAGCAAGGCTCGGATTAGCGCTGTTGCTGCTGAGA 951

Qy 860 TGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCT 919
I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I
Db 952 TAAAGCAGATCCGAGCACGAAGGAAAACAGCCCGGATGCTCATGGTTGTACTTCTGGTCT 1011

Qy 920 TCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGT 979
I II I I I I I I II II I I I I I I I I I I I I I I I I II I I I II II II I I I II I I
Db 1012 TTGCAATTTGCTATCTACCAATCAGCATCCTCAATGTGCTAAAGAGAGTATTTGGGATGT 1071

Qy 980 TCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGG 1039
II II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I
Db 1072 TCACACACACGGAAGACAGAGAGACTGTCTATGCTTGGTTCACTTTTTCTCATTGGCTTG 1131

Qy 1040 TGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCAAATTCCGGG 1099
I II I I I I I I I I I I I I I I I I I I II II II II II II I I I I I I I I I I II I
Db 1132 TATATGCCAACAGTGCTGCAAACCCAATTATTTATAATTTTCTTAGTGGAAAATTTCGAG 1191

Qy 1100 AGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTGGGTCCCTGCGGCTCTCTGA 1159
I I I I I I I I I I I I I I I I I I I I I I I I I I III II I
Db 1192 AGGAATTTAAAGCTGCCTTTTCTTGTTGTCTTGGGGTTCATCATCGCCAAGGAGACCGCC 1251

Qy 1160 AGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTGTCCTTGCAGAGCCGATGCT 1219
III III III III I I I I I I I I I I I I I I I I II
Db 1252 TCGCCAGGGGACGCACGAGCACAGAGAGCAGGAAGTCCCTGACCACACAGATCAGCAACT 1311

Qy 1220 CCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTCACCACAGTGC 1270
II II III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1312 TTGACAATGTATCAAAACTCTCAGAGCACGTGGTGCTCACCAGCATAAGCACACTCC 1368



RESULT 8 
AK038551 
LOCUS AK038551 3729 bp mRNA linear HTC 19-SEP-2003
DEFINITION Mus musculus adult male hypothalamus cDNA, RIKEN full-length
enriched library, clone:A230036M08 product:OREXIN RECEPTOR TYPE 2,
full insert sequence.
ACCESSION AK038551
VERSION AK038551.1 GI:26332642
KEYWORDS HTC; CAP trapper.
SOURCE Mus musculus (house mouse)
ORGANISM Mus musculus
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE 1
AUTHORS Carninci,P. and Hayashizaki,Y.
TITLE High-efficiency full-length cDNA cloning
JOURNAL Meth. Enzymol. 303, 19-44 (1999)
MEDLINE 99279253
PUBMED 10349636
REFERENCE 2
AUTHORS Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
TITLE Normalization and subtraction of cap-trapper-selected cDNAs to
prepare full-length cDNA libraries for rapid discovery of new genes
JOURNAL Genome Res. 10 (10), 1617-1630 (2000)
MEDLINE 20499374
PUBMED 11042159
REFERENCE 3
AUTHORS Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
TITLE RIKEN integrated sequence analysis (RISA) system -- 384-format
sequencing pipeline with 384 multicapillary sequencer
JOURNAL Genome Res. 10 (11), 1757-1771 (2000)
MEDLINE 20530913
PUBMED 11076861
REFERENCE 4
AUTHORS The RIKEN Genome Exploration Research Group Phase II Team and the
FANTOM Consortium.
TITLE Functional annotation of a full-length mouse cDNA collection
JOURNAL Nature 409, 685-690 (2001)
REFERENCE 5
AUTHORS The FANTOM Consortium and the RIKEN Genome Exploration Research
Group Phase I & II Team.
TITLE Analysis of the mouse transcriptome based on functional annotation
of 60,770 full-length cDNAs
JOURNAL Nature 420, 563-573 (2002)
REFERENCE 6 (bases 1 to 3729)
AUTHORS Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci,P.,
Fukuda,S., Furuno,M., Hanagaki,T., Hara,A., Hashizume,W.,
Hayashida,K., Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T.,
Hori,F., Imotani,K., Ishii,Y., Itoh,M., Kagawa,I., Kasukawa,T.,
Katoh,H., Kawai,J., Kojima,Y., Kondo,S., Konno,H., Kouda,M.,
Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Murata,M.,
Nakamura,M., Nishi,K., Nomura,K., Numazaki,R., Ohno,M., Ohsato,N.,
Okazaki,Y., Saito,R., Saitoh,H., Sakai,C., Sakai,K., Sakazume,N.,
Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
Sogabe,Y., Tagami,M., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi,A.,
Muramatsu,M. and Hayashizaki,Y.
TITLE Direct Submission
JOURNAL Submitted (16-JUL-2001) Yoshihide Hayashizaki, The Institute of
Physical and Chemical Research (RIKEN), Laboratory for Genome
Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
RIKEN Yokohama Institute; 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama,
Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
Fax:81-45-503-9216)
COMMENT cDNA library was prepared and sequenced in Mouse Genome 

Encyclopedia Project of Genome Exploration Research Group in Riken 
Genomic Sciences Center and Genome Science Laboratory in RIKEN. 
Division of Experimental Animal Research in Riken contributed to 
prepare mouse tissues. 

Please visit our web site for further details. 
URL:http://genome.gsc.riken.go.jp/
URL:http://fantom.gsc.riken.go.jp/.
FEATURES Location/Qualifiers 
source 1. .3729 

/organism="Mus musculus" 

/mol_type="mRNA"

/strain="C57BL/6J"

/db_xref="FANTOM_DB:A230036M08"

/db_xref="MGI:2402981"

/db_xref="taxon:10090"

/clone="A230036M08"

/sex="male"

/tissue_type="hypothalamus"

/clone_lib="RIKEN full-length enriched mouse cDNA library"
/dev_stage="adult" 
CDS 76. .1458 

/note="unnamed protein product; OREXIN RECEPTOR TYPE 2 
(SWISSPROT|P56719, evidence: FASTY, 98.5%ID, 100%length,
match=1380)
putative"
/codon_start=1
/protein_id="BAC30039.1"
/db_xref="GI:26332643"

/translation="MSSTKLEDSLSRRNWSSASELNETQEPFLNPTDYDDEEFLRYLW 
REYLHPKEYEWLIAGYIIVFWALIGNVLVCVAVWKNHHMRTVTNYFIVNLSLADVL 
VTITCLPATLWDITETWFFGQSLCKVIPYLQTVSVSVSVLTLSCIALDRWYAICHPL 
MFKSTAKRARNSIWIWIVSCIIMIPQAIVMECSSMLPGLANKTTLFTVCDEHWGGEV 
YPKMYHICFFLVTYMAPLCLMILAYLQIFRKLWCRQIPGTSSWQRKWKQQQPVSQPR 
GSGQQSKARISAVAAEIKQIRARRKTARMLMWLLVFAICYLPISILNVLKRVFGMFT
HTEDRETVYAWFTFSHWLVYANSAANPIIYNFLSGKFREEFKAAFSCCLGVHHRQGDR
LARGRTSTESRKSLTTQISNFDNVSKLSEHWLTSISTLPAANGAGPLQNWYLQQGVP
SSLLSTWLEV" 

polyA_signal 3712..3717

/note="putative"

polyA_site 3729

/note="putative"

ORIGIN 

Query Match 45.3%; Score 578.6; DB 11; Length 3729; 

Best Local Similarity 69.0%; Pred. No. 3.8e-105; 

Matches 826; Conservative 0; Mismatches 359; Indels 12; Gaps 2; 

Qy 80 ATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTGTACCCAAAACAGTATGAGT 139
I II II II II II II II I I I I II I II II II I I II III I I I I I I I I
Db 179 ACGACGAGGAATTCCTGCGGTACCTGTGGAGGGAATACCTACACCCGAAAGAATATGAGT 238

Qy 140 GGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCCCTGGTGGGCAACACGCTGG 199
I i 1 1 1 1 1 1 1 1 1 1 1 1 Mir 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1. I 1 1 1 1 1 1 1 1 1
Db 239 GGGTCCTGATCGCAGGGTATATCATCGTGTTCGTTGTGGCTCTCATCGGGAACGTCCTGG 298

Qy 200 TCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTCACCAACTACTTCATTGTCA 259
I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 299 TCTGTGTGGCAGTGTGGAAGAACCACCACATGAGGACAGTCACCAACTACTTCATAGTCA 358

Qy 260 ACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCTGCTGGTGG 319
II I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I III I I I I
Db 359 ACCTTTCCCTAGCAGATGTGCTTGTGACCATCACCTGCCTTCCAGCTACCCTCGTTGTTG 418

Qy 320 ACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGG 379
I I I I I I I I I I I I III I II II II I I I I I I I I I I I I I I I II III I I I II
Db 419 ACATCACTGAGACTTGGTTCTTTGGACAGTCCCTCTGTAAGGTCATTCCTTATTTACAGA 478

Qy 380 CTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGTATG 439
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 479 CTGTGTCAGTGTCTGTGTCTGTTCTTACGTTGAGCTGCATTGCCTTGGACCGATGGTACG 538

Qy 440 CCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCCTGG 499
I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I
Db 539 CCATTTGTCACCCTTTGATGTTCAAGAGCACAGCCAAACGGGCTCGAAACAGCATCGTTG 598

Qy 500 GCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCA 559
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 599 TCATCTGGATCGTCTCCTGCATCATAATGATTCCTCAAGCCATTGTCATGGAGTGCAGCA 658

Qy 560 GTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGG 619
I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I
Db 659 GCATGCTCCCTGGCCTAGCCAATAAGACCACCCTCTTTACAGTATGTGATGAACACTGGG 718

Qy 620 CAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCC 679
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I
Db 719 GCGGTGAAGTTTACCCAAAGATGTACCATATCTGCTTCTTTCTGGTGACATACATGGCAC 778

Qy 680 CACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGA 739
I III I II III I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I
Db 779 CTCTGTGTCTTATGATATTGGCTTATCTCCAAATATTCCGTAAACTCTGGTGCCGACAGA 838

Qy 740 TCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGG 799
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 839 TTCCCGGAACTTCTTCTGTGGTTCAGAGAAAATGGAAGCAGCAGCAGCCGGTTT 892

Qy 800 ACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAG 859
II III i v I II I I I I I II II I I I I I I I I
Db 893 CTCAGCCCCGGGGGTCCGGACAGCAGAGCAAGGCTCGGATTAGCGCTGTTGCTGCTGAGA 952

Qy 860 TGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCT 919
I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I
Db 953 TAAAGCAGATCCGAGCACGAAGGAAAACAGCCCGGATGCTCATGGTTGTACTTCTGGTCT 1012

Qy 920 TCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGT 979
I II I I I I I I II II I I I I I I I I I I I I I I I I II I I I I I II II I I I I I I I
Db 1013 TTGCAATTTGCTATCTACCAATCAGCATCCTCAATGTGCTAAAGAGAGTATTTGGGATGT 1072

Qy 980 TCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGG 1039
II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1073 TCACACACACGGAAGACAGAGAGACTGTCTATGCTTGGTTCACTTTTTCTCATTGGCTTG 1132

Qy 1040 TGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCAAATTCCGGG 1099
I II I I I I I I I I I I I I I I I I I I II II II II II II I I I I I I I I I I II I
Db 1133 TATATGCCAACAGTGCTGCAAACCCAATTATTTATAATTTTCTTAGTGGAAAATTTCGAG 1192

Qy 1100 AGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTGGGTCCCTGCGGCTCTCTGA 1159
I I I I I I I I I I I I I I I I I I I I I I I I I I III II I
Db 1193 AGGAATTTAAAGCTGCCTTTTCTTGTTGTCTTGGGGTTCATCATCGCCAAGGAGACCGCC 1252

Qy 1160 AGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTGTCCTTGCAGAGCCGATGCT 1219
III III III III I I I I I I I I I I I I I I I I II
Db 1253 TCGCCAGGGGACGCACGAGCACAGAGAGCAGGAAGTCCCTGACCACACAGATCAGCAACT 1312

Qy 1220 CCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTCACCACAGTGC 1270
II II III I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I
Db 1313 TTGACAATGTATCAAAACTCTCAGAGCACGTGGTGCTCACCAGCATAAGCACACTCC 1369
AY420887 726 bp DNA linear GSS 17-DEC-2003

Mus musculus HCRTR1 gene, VIRTUAL TRANSCRIPT, partial sequence, 
genomic survey sequence. 
AY420887 

AY420887.1 GI:39776844
GSS. 

Mus musculus (house mouse) 
Mus musculus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 
Mammalia; Eutheria; Rodentia; Sciurognathi ; Muridae; Murinae; Mus. 

1 (bases 1 to 726) 

Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
Adams,M.D. and Cargill,M.

Inferring nonneutral evolution from human- chimp-mouse orthologous 
gene trios 

Science 302 (5652), 1960-1963 (2003) 
14671302 

2 (bases 1 to 726) 

Clark,A.G., Glanowski,S., Nielson,R., Thomas,P., Kejariwal,A.,
Todd,M.A., Tanenbaum,D.M., Civello,D.R., Lu,F., Murphy,B.,
Ferriera,S., Wang,G., Zheng,X.H., White,T.J., Sninsky,J.J.,
Adams,M.D. and Cargill,M.
Direct Submission 

Submitted ( 16-NOV-2003) Celera Genomics, 45 West Gude Drive, 
Rockville, MD 20850, USA 

This sequence as made by sequencing genomic exons and ordering them 
based on alignment. 

Location/Qualifiers 
1. .726 

/organism="Mus musculus" 
/mol_type="genomic DNA"
/db_xref="taxon: 10090" 
<1. .>726 



/gene="HCRTR1"
/locus_tag="HCM7373" 

ORIGIN 

Query Match 45.0%; Score 575.4; DB 29; Length 726; 

Best Local Similarity 87.0%; Pred. No. 8e-105; 

Matches 655; Conservative 0; Mismatches 71; Indels 27; Gaps 1;

Qy 526 ATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGC 585

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 1 ATGGTGCCCCAGGCTGCTGTCATGGAGTGCAGCAGCGTGCTGCCTGAGCTAGCCAATCGC 60 

Qy 586 ACACGGCTCTTCTCAGTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTAC 645

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 61 ACCCGGCTCTTCTCTGTCTGTGATGAGCACTGGGCAGATGAACTCTACCCCAAGATCTAT 120

Qy 646 CACAGTTGCTTCTTTATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTAT 705 

I I I I I I I I II II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 CACAGCTGCTTTTTCATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCTATGGCCTAT 180 

Qy 706 TTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTG 765

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I M I I I I I Mill 

Db 181 TTCCAGATCTTCCGCAAGCTCTGGGGCCGCCAGATCCCTGGTACCACATCAGCCTTGGTG 240 

Qy 766 CGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAG 825

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I
Db 241 CGGAACTGGAAACGGCCCTCGGAACAACTGGAGGCTCAGCACCAGGGCCTCTGTACAGAG 300

Qy 826 CCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAG 885 

I I I I I I I I I I I II I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I 

Db 301 CCCCAGCCCCGGGCCCGAGCCTTCCTGGCTGAGGTGAAGCAGATGCGAGCTCGGAGGAAG 360 

Qy 886 ACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGC 945

II II II I I I I I I I I I I II II I I I I I I I I II II I I I I I II I I II I I I I I II 

Db 361 ACGGCTAAGATGCTGATGGTAGTCCTGCTGGTTTTTGCACTCTGTTATCTGCCCATCAGT 420 

Qy 946 GTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCT 1005 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II Mill 

Db 421 GTCCTCAATGTCCTTAAGAGAGTGTTCGGGATGTTCCGCCAAGCCAGCGACCGGGAAGCC 480

Qy 1006 GTCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCC 1065 

M I II I I M I I II I II I I I I I I I I I I I I I II I I I I I I I I II II II I II MINIM 

Db 481 GTCTACGCCTGCTTCACCTTCTCCCACTGGCTAGTGTACGCCAACAGTGCCGCCAACCCT 540

Qy 1066 ATCATCTACAACTTCCTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGC 1125

II I II I I I I II M I I II II I II I I II I I II I i I II Mi III I I I II I II II I II I II I I 

Db 541 ATCATCTACAACTTCCTCAGTGGCAAATTCCGGGAGCAGTTCAAGGCTGCCTTCTCCTGC 600 

Qy 1126 TGCCTGCCTGGCCTGGGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCC 1185 

I II II I I I I I I II I I I I II I II I I II I II II 

Db 601 TGCCTGCCTGGTCTGG GTCCCGGCTCCTCTGCC 633 

Qy 1186 AGCCACAAGTCCTTGTCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTG 1245

II II II I II II I M II II I I M I I II I I I II I II I I I I I I II I I I I I I I I II I I I 

Db 634 AGACACAAGTCCTTGTCCTTGCAGAGCCGCTGCTCCGTCTCCAAGGTCTCTGAGCATGTC 693 

Qy 1246 GTGCTCACCAGCGTCACCACAGTGCTGCCCTGA 1278
I I I I I I I I I I I I I I I II I I I I I I I I I I I
Db 694 GTGCTGACCACCGTCACTACCGTGCTGTCCTGA 726

RESULT 10
AK079572

LOCUS       AK079572                3153 bp    mRNA    linear   HTC 19-SEP-2003
DEFINITION  Mus musculus adult male hypothalamus cDNA, RIKEN full-length
            enriched library, clone:A230091E19 product:OREXIN RECEPTOR TYPE 2,
            full insert sequence.
ACCESSION   AK079572
VERSION     AK079572.1  GI:26348079
KEYWORDS    HTC; CAP trapper.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1
  AUTHORS   Carninci,P. and Hayashizaki,Y.
  TITLE     High-efficiency full-length cDNA cloning
  JOURNAL   Meth. Enzymol. 303, 19-44 (1999)
  MEDLINE   99279253
   PUBMED   10349636
REFERENCE   2
  AUTHORS   Carninci,P., Shibata,Y., Hayatsu,N., Sugahara,Y., Shibata,K.,
            Itoh,M., Konno,H., Okazaki,Y., Muramatsu,M. and Hayashizaki,Y.
  TITLE     Normalization and subtraction of cap-trapper-selected cDNAs to
            prepare full-length cDNA libraries for rapid discovery of new genes
  JOURNAL   Genome Res. 10 (10), 1617-1630 (2000)
  MEDLINE   20499374
   PUBMED   11042159
REFERENCE   3
  AUTHORS   Shibata,K., Itoh,M., Aizawa,K., Nagaoka,S., Sasaki,N., Carninci,P.,
            Konno,H., Akiyama,J., Nishi,K., Kitsunai,T., Tashiro,H., Itoh,M.,
            Sumi,N., Ishii,Y., Nakamura,S., Hazama,M., Nishine,T., Harada,A.,
            Yamamoto,R., Matsumoto,H., Sakaguchi,S., Ikegami,T., Kashiwagi,K.,
            Fujiwake,S., Inoue,K., Togawa,Y., Izawa,M., Ohara,E., Watahiki,M.,
            Yoneda,Y., Ishikawa,T., Ozawa,K., Tanaka,T., Matsuura,S., Kawai,J.,
            Okazaki,Y., Muramatsu,M., Inoue,Y., Kira,A. and Hayashizaki,Y.
  TITLE     RIKEN integrated sequence analysis (RISA) system--384-format
            sequencing pipeline with 384 multicapillary sequencer
  JOURNAL   Genome Res. 10 (11), 1757-1771 (2000)
  MEDLINE   20530913
   PUBMED   11076861
REFERENCE   4
  AUTHORS   The RIKEN Genome Exploration Research Group Phase II Team and the
            FANTOM Consortium.
  TITLE     Functional annotation of a full-length mouse cDNA collection
  JOURNAL   Nature 409, 685-690 (2001)
REFERENCE   5
  AUTHORS   The FANTOM Consortium and the RIKEN Genome Exploration Research
            Group Phase I & II Team.
  TITLE     Analysis of the mouse transcriptome based on functional annotation
            of 60,770 full-length cDNAs
  JOURNAL   Nature 420, 563-573 (2002)
REFERENCE   6  (bases 1 to 3153)
  AUTHORS   Adachi,J., Aizawa,K., Akimura,T., Arakawa,T., Bono,H., Carninci,P.,
            Fukuda,S., Furuno,M., Hanagaki,T., Hara,A., Hashizume,W.,
            Hayashida,K., Hayatsu,N., Hiramoto,K., Hiraoka,T., Hirozane,T.,
            Hori,F., Imotani,K., Ishii,Y., Itoh,M., Kagawa,I., Kasukawa,T.,
            Katoh,H., Kawai,J., Kojima,Y., Kondo,S., Konno,H., Kouda,M.,
            Koya,S., Kurihara,C., Matsuyama,T., Miyazaki,A., Murata,M.,
            Nakamura,M., Nishi,K., Nomura,K., Numazaki,R., Ohno,M., Ohsato,N.,
            Okazaki,Y., Saito,R., Saitoh,H., Sakai,C., Sakai,K., Sakazume,N.,
            Sano,H., Sasaki,D., Shibata,K., Shinagawa,A., Shiraki,T.,
            Sogabe,Y., Tagami,M., Tagawa,A., Takahashi,F., Takaku-Akahira,S.,
            Takeda,Y., Tanaka,T., Tomaru,A., Toya,T., Yasunishi,A.,
            Muramatsu,M. and Hayashizaki,Y.
  TITLE     Direct Submission
  JOURNAL   Submitted (16-APR-2002) Yoshihide Hayashizaki, The Institute of
            Physical and Chemical Research (RIKEN), Laboratory for Genome
            Exploration Research Group, RIKEN Genomic Sciences Center (GSC),
            RIKEN Yokohama Institute; 1-7-22 SueHiro-cho, Tsurumi-ku, Yokohama,
            Kanagawa 230-0045, Japan (E-mail:genome-res@gsc.riken.go.jp,
            URL:http://genome.gsc.riken.go.jp/, Tel:81-45-503-9222,
            Fax:81-45-503-9216)
COMMENT     cDNA library was prepared and sequenced in Mouse Genome
            Encyclopedia Project of Genome Exploration Research Group in Riken
            Genomic Sciences Center and Genome Science Laboratory in RIKEN.
            Division of Experimental Animal Research in Riken contributed to
            prepare mouse tissues.
            Please visit our web site for further details.
            URL:http://genome.gsc.riken.go.jp/
            URL:http://fantom.gsc.riken.go.jp/.
FEATURES             Location/Qualifiers
     source          1..3153
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6J"
                     /db_xref="FANTOM_DB:A230091E19"
                     /db_xref="MGI:2403517"
                     /db_xref="taxon:10090"
                     /clone="A230091E19"
                     /sex="male"
                     /tissue_type="hypothalamus"
                     /clone_lib="RIKEN full-length enriched mouse cDNA library"
                     /dev_stage="adult"
     CDS             108..1202
                     /note="unnamed protein product; OREXIN RECEPTOR TYPE 2
                     (SWISSPROT|P56719, evidence: FASTY, 98.5%ID, 100%length,
                     match=1380)
                     putative"
                     /codon_start=1
                     /protein_id="BAC37688.1"
                     /db_xref="GI:26348080"
                     /translation="MSSTKLEDSLSRRNWSSASELNETQEPFLNPTDYDDEEFLRYLW
                     REYLHPKEYEWLIAGYIIVFWALIGNVLVCVAWKNH™^
                     VTITCLPATLWDITETWFFGQSLCKVIPYLQTVSVSVSVLTLSCIALDRWYAICHPL
                     MFKSTAKRARNSIWIWIVSCIIMIPQAIVMECSSMLPGLANKTTLFTVCDEHWGGEV
                     YPmYHICFFLVTYMAPLFLMILAYLQIFRKLWCRQIPGTSSWQRKWKQQQPVSQPR
                     GSGQQSKARVSAVAAEIKQIRARRKTARMLMVVLLVFAICYLPISILNVLKRVFGMFT
                     HTEDRETVYAWFTFPHWLVYANSCCKPNYL"
ORIGIN



Query Match 44.4%; Score 567.6; DB 11; Length 3153; 

Best Local Similarity 68.9%; Pred. No. 5.7e-103; 

Matches 826; Conservative 0; Mismatches 359; Indels 13; Gaps 3; 
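
The two percentages in the summary lines above appear to follow directly from the other figures printed with them. The Python sketch below is an assumption inferred from the numbers in this listing, not a documented formula of the search program: it takes Query Match as the score divided by the perfect score (assumed here to be 1278, the query length), and Best Local Similarity as matches divided by all aligned columns. The function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect (self-match) score.

    Assumption: the perfect score equals the query length (1278).
    """
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all aligned columns.

    Assumption: aligned columns = matches + conservative
    + mismatches + indels.
    """
    aligned = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned


# Figures taken from the AK079572 summary lines above.
print(round(query_match_pct(567.6, 1278), 1))                # 44.4
print(round(best_local_similarity_pct(826, 0, 359, 13), 1))  # 68.9
```

The same two relations reproduce the percentages reported for the other hits in this section, for example Score 468.4 with 469 matches and 1 mismatch gives 36.7% and 99.8%.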

Qy 80 ATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTGTACCCAAAACAGTATGAGT 139

I M M M M II II II I I I I I I I II I I M I I I I IN I I I I I I I I 
Db 211 ACGACGAGGAATTCCTGCGGTACCTGTGGAGGGAATACCTACACCCGAAAGAATATGAGT 270

Qy 140 GGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCCCTGGTGGGCAACACGCTGG 199 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 271 GGGTCCTGATCGCAGGGTATATCATCGTGTTCGTTGTGGCTCTCATCGGGAACGTCCTGG 330 

Qy 200 TCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTCACCAACTACTTCATTGTCA 259

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 331 TCTGTGTGGCAGTGTGGAAGAACCACCACATGAGGACAGTCACCAACTACTTCATAGTCA 390

Qy 260 ACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCTGCTGGTGG 319 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II 

Db 391 ACCTTTCCCTAGCAGATGTGCTTGTGACCATCACCTGCCTTCCAGCTACCCTCGTTGTTG 450

Qy 320 ACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGG 379 

I I I I I I I I I I I I III I II II II I I I I I I I I I I I I I I I II III I I I I I 
Db 451 ACATCACTGAGACTTGGTTCTTTGGACAGTCCCTCTGTAAGGTCATTCCTTATTTACAGA 510 

Qy 380 CTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGTATG 439 

I I I I I I I I I I I I III I II II II I I II I III III I I I I I I I I I I I I I 
Db 511 CTGTGTCAGTGTCTGTGTCTGTTCTTACGTTGAGCTGCATTGCCTTGGACCGATGGTACG 570 

Qy 440 CCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCCTGG 499

I I I II II II I I I I II II I I I I I I II I I I I I I I I II I I I I I I II 
Db 571 CCATTTGTCACCCTTTGATGTTCAAGAGCACAGCCAAACGGGCTCGAAACAGCATCGTTG 630

Qy 500 GCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCA 559 

I I I I I I I I II I I I II I II I II II I I I I I I I I I I I I I I I I 

Db 631 TCATCTGGATCGTCTCCTGCATCATAATGATTCCTCAAGCCATTGTCATGGAGTGCAGCA 690

Qy 560 GTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGG 619 

I I I I II II I I I I I II I I II I I I I I I I I M I I I I I I I I I I I I I I 

Db 691 GCATGCTCCCTGGCCTAGCCAATAAGACCACCCTCTTTACAGTATGTGATGAACACTGGG 750

Qy 620 CAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCC 679

M I I II II II I I I II II I I I I I I I I I I I II I II I I I II I I II 
Db 751 GCGGTGAAGTTTACCCAAAGATGTACCATATCTGCTTCTTTCTGGTGACATACATGGCAC 810

Qy 680 CACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGA 739

I lit 1 M Ml I I I I III I I I I II I I I I I I II I I I I I I I I II I I I I 

Db 811 CTCTGTTTCTTATGATATTGGCTTATCTCCAAATATTCCGTAAACTCTGGTGCCGACAGA 870

Qy 740 TCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGG 799

I Mill II I II I I II I II I I I I I I I I I I I M I 

Db 871 TTCCCGGAACTTCTTCTGTGGTTCAGAGAAAATGGAAGCAGC------AGCAGCCGGTTT 924

Qy 800 ACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAG 859 

II II II I II II II I I II I I I I I I I I I I 

Db 925 CTCAGCCCCGGGGGTCCGGACAGCAGAGCAAGGCTCGGGTTAGCGCTGTTGCTGCTGAGA 984

Qy 860 TGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCT 919
I I I I I I I I I II I I I I I Mill I I I I I I | | | | I I I I I I I II II I II I I I I
Db 985 TAAAGCAGATCCGAGCACGAAGGAAAACAGCCCGGATGCTCATGGTTGTACTTCTGGTCT 1044



Qy 920 TCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGT 979 

I II I I I I I I M II MINI I I I I I I I I I I I I I I I I I II II I I I I I I I 
Db 1045 TTGCAATTTGCTATCTACCAATCAGCATCCTCAATGTGCTAAAGAGAGTATTTGGGATGT 1104

Qy 980 TCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGG 1039 

M II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 

Db 1105 TCACACACACGGAAGACAGAGAGACTGTCTATGCTTGGTTCACTTTTCCTCATTGGCTTG 1164 

Qy 1040 TGTACGCCAACAGC-GCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCAAATTCCGG 1098 

I M M I I I I I I I I I I II I I I I I II II II II II II I II I I I I I I I I I 
Db 1165 TATATGCCAACAGCTGCTGCAAACCCAATTATTTATAATTTTCTTAGTGGAAAATTTCGA 1224

Qy 1099 GAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTGGGTCCCTGCGGCTCTCTG 1158 

I I I I I I I I I I I I I I I I I I I I I I I I I I I III || | 

Db 1225 GAGGAATTTAAAGCTGCCTTTTCTTGTTGTCTTGGGGTTCATCATCGCCAAGGAGACCGC 1284 



Qy 1159 AAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTGTCCTTGCAGAGCCGATGC 1218 

I I I Ml III III I I I I I I I I I I I I I I I I I 

Db 1285 CTCGCCAGGGGACGCACGAGCACAGAGAGCAGGAAGTCCCTGACCACACAGATCAGCAAC 1344

Qy 1219 T C C GT CT C C AAAAT CT CT GAGC AT GT GGT GCT C ACC AGC GT C AC CAC AGT G C 1270 

I M M IN I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 1345 TTTGACAATGTATCAAAACTCTCAGAGCACGTGGTGCTCACCAGCATAAGCACACTCC 1402



RESULT 11 

BC035858 

LOCUS       BC035858                1790 bp    mRNA    linear   HTC 04-MAR-2003
DEFINITION  Homo sapiens, Similar to hypocretin (orexin) receptor 2, clone
            IMAGE:5767576, mRNA.
ACCESSION   BC035858
VERSION     BC035858.1  GI:23959160
KEYWORDS    HTC.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1790)
  AUTHORS   Strausberg,R.
  TITLE     Direct Submission
  JOURNAL   Submitted (31-JUL-2002) National Institutes of Health, Mammalian
            Gene Collection (MGC), Cancer Genomics Office, National Cancer
            Institute, 31 Center Drive, Room 11A03, Bethesda, MD 20892-2590,
            USA
  REMARK    NIH-MGC Project URL: http://mgc.nci.nih.gov
COMMENT     Contact: MGC help desk
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Life Technologies, Inc.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: National Institutes of Health Intramural
            Sequencing Center (NISC),
            Gaithersburg, Maryland;
            Web site: http://www.nisc.nih.gov/
            Contact: nisc_mgc@nhgri.nih.gov
            Akhter,N., Ayele,K., Beckstrom-Sternberg,S.M., Benjamin,B.,
            Blakesley,R.W., Bouffard,G.G., Breen,K., Brinkley,C., Brooks,S.,
            Dietrich,N.L., Granite,S., Guan,X., Gupta,J., Haghighi,P.,
            Hansen,N., Ho,S.-L., Karlins,E., Kwong,P., Laric,P., Legaspi,R.,
            Maduro,Q.L., Masiello,C., Maskeri,B., Mastrian,S.D., McCloskey,J.C.,
            McDowell,J., Pearson,R., Stantripop,S., Thomas,P.J., Touchman,J.W.,
            Tsurgeon,C., Vogt,J.L., Walker,M.A., Wetherby,K.D., Wiggins,L.,
            Young,A., Zhang,L.-H. and Green,E.D.
            Clone distribution: MGC clone distribution information can be found
            through the I.M.A.G.E. Consortium/LLNL at: http://image.llnl.gov
            Series: IRAK Plate: 79 Row: p Column: 14
            This clone was selected for full length sequencing because it
            passed the following selection criteria: matched mRNA gi: 6006037
            This clone has the following problem: retained intron.
FEATURES             Location/Qualifiers
     source          1..1790
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5767576"
                     /tissue_type="Brain, fetal, whole pooled"
                     /clone_lib="NIH_MGC_121"
                     /lab_host="DH10B"
                     /note="Vector: pCMV-SPORT6"
ORIGIN



Query Match 40.8%; Score 521; DB 11; Length 1790;

Best Local Similarity 70.3%; Pred. No. 9.5e-94;

Matches 714; Conservative 0; Mismatches 295; Indels 6; Gaps 1;

Qy 80 ATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTGTACCCAAAACAGTATGAGT 139
I M I I II II II II || I I I I I I | || | | Ml I I I I III I I I I I I I I
Db 146 ACGACGAGGAATTCCTGCGGTACCTGTGGAGGGAATACCTGCACCCGAAAGAATATGAGT 205

Qy 140 GGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCCCTGGTGGGCAACACGCTGG 199
I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 206 GGGTCCTGATCGCCGGGTACATCATCGTGTTCGTCGTGGCTCTCATTGGGAACGTCCTGG 265

Qy 200 TCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTCACCAACTACTTCATTGTCA 259
I M MM II I I II I I M I I II II I II II II I II I I I II I II I I M I I II II
Db 266 TTTGTGTGGCAGTGTGGAAGAACCACCACATGAGGACGGTAACCAACTACTTCATAGTCA 325

Qy 260 ACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCTGCTGGTGG 319
I M M M I II I II II M -N III I I I I II I II I I I I I I I I I I I I
Db 326 ATCTTTCTCTGGCTGATGTGCTCGTGACCATCACCTGCCTTCCAGCCACACTGGTCGTGG 385

Qy 320 ACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGG 379
I I M M II I I II II I | I I II I II I II I II I II
Db 386 ATATCACTGAGACCTGGTTTTTTGGACAGTCCCTTTGCAAAGTGATTCCTTATCTACAGA 445

Qy 380 CTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGTATG 439
I I M II Ill I II I I I M | | I I I I I II I II II II II I
Db 446 CCGTGTCGGTGTCTGTGTCTGTCCTCACACTGAGCTGTATCGCCTTGGATCGGTGGTATG 505

Qy 440 CCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCCTGG 499
I I I I I I Mill I I I I I I I I I I I I I I I I MINIMI I Ml I
Db 506 CAATCTGTCACCCTTTGATGTTTAAGAGCACAGCAAAGCGGGCCCGTAACAGCATTGTCA 565

Qy 500 GCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCA 559 

I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | 

Db 566 TCATCTGGATTGTCTCCTGCATTATAATGATTCCTCAGGCCATCGTCATGGAGTGCAGCA 625 

Qy 560 GTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGG 619 

I I I I I I I I I I I I I I II I II I I I I I I I I I I I | | | | | | | | | 

Db 626 CCGTGTTCCCAGGCTTAGCCAATAAAACCACCCTCTTTACGGTGTGTGATGAGCGCTGGG 685 

Qy 620 CAGAT GACCT CTAT C C CAAGAT CTAC CACAGTT GCT T CTT TAT T GT CAC CT AC CT GGC CC 679 

Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | 
Db 686 GTGGTGAAATTTATCCCAAGATGTACCACATCTGTTTCTTTCTGGTGACATACATGGCAC 745

Qy 680 CACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGA 739 

Mill I I I I I I II MM III I II I I I I I I I I I I I I M I I I II MM 

Db 746 CACTGTGTCTCATGGTGTTGGCTTATCTGCAAATATTTCGCAAACTCTGGTGTCGACAGA 805 

Qy 740 TCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGG 799 

M M I I I I I I I I I I I I II I I I II II I I I I I I I 

Db 806 TCCCTGGAACATCATCTGTAGTTCAGAGAAAATGGAAGCCCC------TGCAGCCTGTTT 859

Qy 800 ACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAG 859 

M I M I I I I I I II II II II I I I II II I 

Db 860 CACAGCCTCGAGGGCCAGGACAGCCAACGAAGTCCCGGATGAGCGCTGTGGCGGCTGAAA 919 

Qy 860 TGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCT 919

I I I M I I II II II I I I II I II I I I I II II II I II I I I || M II II I 
Db 920 TAAAGCAGATCCGAGCCAGAAGGAAAACAGCCCGGATGTTGATGGTTGTGCTTTTGGTAT 979

Qy 920 TCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGT 979 

I M I Mill II II II II I I I I I I II I II M I I II I M M I I I I II I 
Db 980 TTGCAATTTGCTATCTACCAATTAGCATCCTCAATGTGCTAAAGAGAGTATTTGGGATGT 1039 

Qy 980 TCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGG 1039 

I Ml I I M I II I I I I I I I I I M I I I M I I I I M M I I I I I 

Db 1040 TTGCCCATACTGAAGACAGAGAGACTGTGTATGCCTGGTTTACCTTTTCACACTGGCTTG 1099 



Qy 1040 TGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCAAATT 1094

I M I I I I I II I I I II II II II II II II II II I I I I I I I II 
Db 1100 TATATGCCAATAGTGCTGCGAATCCAATTATTTATAATTTTCTCAGTGGTGAGTT 1154 



RESULT 12 
AL535838
LOCUS       AL535838                1001 bp    mRNA    linear   EST 12-MAY-2003
DEFINITION  AL535838 Homo sapiens FETAL BRAIN Homo sapiens cDNA clone
            CS0DF013YE04 5-PRIME, mRNA sequence.
ACCESSION   AL535838
VERSION     AL535838.2  GI:3054275#
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1001)
  AUTHORS   Li,W.B., Gruber,C., Jessee,J. and Polayes,D.
  TITLE     Full-length cDNA libraries and normalization
  JOURNAL   Unpublished (2001)
COMMENT     On Feb 13, 2001 this sequence version replaced gi:12799331.
            Contact: Genoscope
            Genoscope - Centre National de Sequencage
            BP 191 91006 EVRY cedex - France
            Email: seqref@genoscope.cns.fr, Web : www.genoscope.cns.fr
            Library was constructed by Life Technologies, a division of
            Invitrogen. This sequence belongs to sequence cluster 151.r For
            more information about this cluster, see
            http://www.genoscope.cns.fr/
            cgi-bin/cluster.cgi?seq=CS0DF013BC02QP1&cluster=151.r. Contact :
            Feng Liang Email : fliang@lifetech.com URL :
            http://fulllength.invitrogen.com/ InVitroGen Corporation 1600
            Faraday Avenue Genoscope sequence ID : CS0DF013BC02QP1.
FEATURES             Location/Qualifiers
     source          1..1001
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="CS0DF013YE04"
                     /tissue_type="FETAL BRAIN"
                     /dev_stage="fetal"
                     /clone_lib="Homo sapiens FETAL BRAIN"
                     /note="Organ: brain; Vector: pCMVSPORT_6; 1st strand cDNA
                     was primed with a NotI-oligo(dT) primer. Five prime end
                     enriched, double-strand cDNA was digested with Not I and
                     cloned into the Not I and EcoRV sites of the pCMVSPORT 6
                     vector. Library was not normalized."
ORIGIN



Query Match 36.8%; Score 470.4; DB 9; Length 1001;

Best Local Similarity 91.9%; Pred. No. 1e-83;

Matches 543; Conservative 13; Mismatches 26; Indels 9; Gaps 6;

Qy 377 AGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGT 436
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I : I I I I I I I I I I I I I
Db 415 AGGCTGTGTCCGTGTCANTGGCAGTGCTAACTCTMANCTTCATCGCMCTGGACCGCTGGT 474

Qy 437 ATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCC 496
I I I I : I I I I I I : I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 475 ATGCHATCTGCYACCCACTATTGTTCAAGARCACAGCCCGGCGGGCCCGTGGCTCCATCC 534

Qy 497 TGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCA 556
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I iTI I I I I I I
Db 535 TNNGNATCTGGGCTNTNTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCA 594

Qy 557 GCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCT 616
I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 595 GCAGTGTGCTGCCTNAGCTANCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCT 654

Qy 617 GGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGG 676
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 655 GGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGG 714

Qy 677 CCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCC 736
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 715 CCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCC 774



Qy 737 AGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCT-G 795 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | I I I I I I | I I I 
Db 775 AGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGG 834 

Qy 796 GGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCT 855 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III II I I I I I I I II I I I I I I I 

Db 835 GGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCC--AGCCCCGGGCCGCGCCTTCCTGGCT 892

Qy 856 GAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTG 915

I I I I M I : I I II I : I I I I I : I : I I I I I I I I I I I I II I I : I I I I I I I 

Db 893 GAAGT GA RCAGAT G S T GCAG GC AGVAGAC AS C S AAGATG CT GAT GGT GGB GCT GCTG 949 

Qy 916 GTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGG 966 

: II I I I I 1111111111:11 1111111:11 I I I I I I II I I I I I I I I 
Db 950 STCTTCG-CCTCTGCTACSTG-CCATCAGSGT-CTCAATGTCTTAAAGAGG 997 



RESULT 13 

BQ269289 

LOCUS       BQ269289                 520 bp    mRNA    linear   EST 15-JUL-2003
DEFINITION  ik23f12.y1 HR85 islet Homo sapiens cDNA clone IMAGE:5782030 5'
            similar to SW:OX1R_HUMAN O43613 OREXIN RECEPTOR TYPE 1 ;, mRNA
            sequence.
ACCESSION   BQ269289
VERSION     BQ269289.1  GI:20494355
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 520)
  AUTHORS   Melton,D., Brown,J., Kenty,G., Permutt,A., Lee,C., Kaestner,K.,
            Lemishka,I., Scearce,M., Brestelli,J., Gradwohl,G., Clifton,S.,
            Hillier,L., Marra,M., Pape,D., Wylie,T., Martin,J., Blistain,A.,
            Schmitt,A., Theising,B., Ritter,E., Ronko,I., Bennett,J.,
            Cardenas,M., Gibbons,M., McCann,R., Cole,R., Tsagareishvili,R.,
            Williams,T., Jackson,Y. and Bowers,Y.
  TITLE     Endocrine Pancreas Consortium
  JOURNAL   Unpublished (2000)
COMMENT     Contact: Douglas Melton, Klaus H. Kaestner, & Hiroshi Inoue
            Endocrine Pancreas Consortium
            Harvard University, Howard Hughes Medical Institute
            Dept of Molecular and Cellular Biology, 7 Divinity Ave, Cambridge,
            MA 02138
            Tel: 617-495-1812
            Fax: 617-495-8557
            Email: dmelton@biohp.harvard.edu
            Library was constructed by Dr. Hiroshi Inoue DNA sequencing by:
            Washington University Genome Sequencing Center For information on
            obtaining a clone please contact: Dr. Hiroshi Inoue
            (hinoue@im.wustl.edu)
            Seq primer: -40RP from Gibco
            High quality sequence stop: 426.
FEATURES             Location/Qualifiers
     source          1..520
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5782030"
                     /tissue_type="Purified pancreatic islet"
                     /lab_host="DH10B"
                     /clone_lib="HR85 islet"
                     /note="Organ: Pancreas; Vector: pBluescript SK(-); Site_1:
                     NotI; Site_2: XhoI; cDNA made by oligo-dT priming.
                     Size-selected on agarose gel. Average insert size ~1kb. 5'
                     XhoI site was destroyed after directional cloning.
                     Amplified once. Contact information: Hiroshi Inoue, MD,
                     Metabolism Div. (Alan Permutt Lab), Washington University
                     School of Medicine, Box 8127, 660 South Euclid Ave., St.
                     Louis, MO 63110, E-mail: hinoue@imgate.wustl.edu, Tel:
                     314-362-1916, Fax: 314-747-2692."
ORIGIN



Query Match 36.7%; Score 468.4; DB 13; Length 520; 

Best Local Similarity 99.8%; Pred. No. 1.9e-83; 

Matches 469; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy        809 AGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGA 868
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 AGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGA 60

Qy        869 TGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCT 928
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 TGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCT 120

Qy        929 GCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGTTCCGCCAAG 988
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 GCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGTTCCGCCAAG 180

Qy        989 CCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCA 1048
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 CCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCA 240

Qy       1049 ACAGCGCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCAAATTCCGGGAGCAGTTTA 1108
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 ACAGCGCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCAAATTCCGGGAGCAGTTTA 300

Qy       1109 AGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTGGGTCCCTGCGGCTCTCTGAAGGCCCCTA 1168
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 AGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTGGGTCCCTGCGGCTCTCTGAAGGCCCCTA 360

Qy       1169 GTCCCCGCTCCTCTGCCAGCCACAAGTCCTTGTCCTTGCAGAGCCGATGCTCCGTCTCCA 1228
              ||||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||
Db        361 GTCCCCGCTCCTCTGCCAGCCACAAGTCCTTGTCCTTGCAGAGCCGATGCTCCATCTCCA 420

Qy       1229 AAATCTCTGAGCATGTGGTGCTCACCAGCGTCACCACAGTGCTGCCCTGA 1278
              ||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 AAATCTCTGAGCATGTGGTGCTCACCAGCGTCACCACAGTGCTGCCCTGA 470





RESULT 14 



BX409735 
LOCUS       BX409735                 892 bp    mRNA    linear   EST 13-MAY-2003
DEFINITION  BX409735 Homo sapiens FETAL BRAIN Homo sapiens cDNA clone
            CS0DF013YE04 5-PRIME, mRNA sequence.
ACCESSION   BX409735
VERSION     BX409735.1  GI:30652997
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 892)
  AUTHORS   Li,W.B., Gruber,C., Jessee,J. and Polayes,D.
  TITLE     Full-length cDNA libraries and normalization
  JOURNAL   Unpublished (2001)
COMMENT     Contact: Genoscope
            Genoscope - Centre National de Sequencage
            BP 191 91006 EVRY cedex - France
            Email: seqref@genoscope.cns.fr, Web : www.genoscope.cns.fr
            Library was constructed by Life Technologies, a division of
            Invitrogen. This sequence belongs to sequence cluster 151.r For
            more information about this cluster, see
            http://www.genoscope.cns.fr/
            cgi-bin/cluster.cgi?seq=CS0BAF012ZE07_AF01110_1&cluster=151.r.
            Contact : Feng Liang Email : fliang@lifetech.com URL :
            http://fulllength.invitrogen.com/ InVitroGen Corporation 1600
            Faraday Avenue Genoscope sequence ID : CS0BAF012ZE07_AF01110_1.
FEATURES             Location/Qualifiers
     source          1..892
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="CS0DF013YE04"
                     /tissue_type="FETAL BRAIN"
                     /dev_stage="fetal"
                     /clone_lib="Homo sapiens FETAL BRAIN"
                     /note="Organ: brain; Vector: pCMVSPORT_6; 1st strand cDNA
                     was primed with a NotI-oligo(dT) primer. Five prime end
                     enriched, double-strand cDNA was digested with Not I and
                     cloned into the Not I and EcoRV sites of the pCMVSPORT 6
                     vector. Library was not normalized."
ORIGIN



Query Match 34.2%; Score 437.4; DB 13; Length 892;

Best Local Similarity 99.8%; Pred. No. 3.9e-77;

Matches 438; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy        840 CCGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCT 899
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 CCGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCT 60

Qy        900 GATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCT 959
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 GATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCT 120

Qy        960 TAAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTT 1019
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TAAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTT 180

Qy       1020 CACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTT 1079
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 CACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTT 240

Qy       1080 CCTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCT 1139
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 CCTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCT 300

Qy       1140 GGGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTT 1199
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 GGGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTT 360

Qy       1200 GTCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGT 1259
              |||||||||||||||||||||| |||||||||||||||||||||||||||||||||||||
Db        361 GTCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGT 420

Qy       1260 CACCACAGTGCTGCCCTGA 1278
              |||||||||||||||||||
Db        421 CACCACAGTGCTGCCCTGA 439



RESULT 15 

BM926746 

LOCUS       BM926746                 993 bp    mRNA    linear   EST 12-MAR-2002
DEFINITION  AGENCOURT_6681991 NIH_MGC_121 Homo sapiens cDNA clone IMAGE:5767576
            5', mRNA sequence.
ACCESSION   BM926746
VERSION     BM926746.1  GI:19377125
KEYWORDS    EST.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 993)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Life Technologies, Inc.
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM12826 row: a column: 17
            High quality sequence stop: 684.
FEATURES             Location/Qualifiers
     source          1..993
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /clone="IMAGE:5767576"
                     /lab_host="DH10B"
                     /clone_lib="NIH_MGC_121"
                     /note="Organ: brain; Vector: pCMV-SPORT6; Site_1: NotI;
                     Site_2: EcoRV (destroyed); RNA source anonymous pool of 3
                     fetal brains, female age 20 weeks, female age 24 weeks,
                     and male age 26 weeks. Library is oligo-dT primed and
                     directionally cloned (EcoRV site is destroyed upon
                     cloning). Average insert size 1.7 kb, insert size range
                     0.7-3.5 kb. Library is normalized and enriched for
                     full-length clones and was constructed by C. Gruber
                     (Invitrogen). Research Genetics tracking code 017. Note:
                     this is a NIH_MGC Library."
ORIGIN



Query Match 30.8%; 
Best Local Similarity 72.1%; 
Matches 512; Conservative 



Score 393.2; DB 12; 
Pred. No. 2.9e-68; . 
0; Mismatches 198; 



Length 993; 
Indels 0; 



Gaps 



0; 



Qy     80 ATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTGTACCCAAAACAGTATGAGT 139
          | || || || || || || || |||||| | || || ||| |||| ||| | |||||||
Db    145 ACGACGAGGAATTCCTGCGGTACCTGTGGAGGGAATACCTGCACCCGAAAGAATATGAGT 204

Qy    140 GGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCCCTGGTGGGCAACACGCTGG 199
          ||||||| ||||| |  ||  |    |||||||||||||| ||  | || |||   ||||
Db    205 GGGTCCTGATCGCCGGGTACATCATCGTGTTCGTCGTGGCTCTCATTGGGAACGTCCTGG 264

Qy    200 TCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTCACCAACTACTTCATTGTCA 259
          | ||  |||| ||||||  |||||||||||||||||| || |||||||||||||| ||||
Db    265 TTTGTGTGGCAGTGTGGAAGAACCACCACATGAGGACGGTAACCAACTACTTCATAGTCA 324

Qy    260 ACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCTGCTGGTGG 319
          | || || |||||||| || || |||||    | |||||| || ||||  ||| | ||||
Db    325 ATCTTTCTCTGGCTGATGTGCTCGTGACCATCACCTGCCTTCCAGCCACACTGGTCGTGG 384

Qy    320 ACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGG 379
          | ||||||||| ||||| | || || ||  |||| ||||| || || || |||||||||
Db    385 ATATCACTGAGACCTGGTTTTTTGGACAGTCCCTTTGCAAAGTGATTCCTTATCTACAGA 444

Qy    380 CTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGTATG 439
          | ||||| ||||| ||| | || || || || ||||  |||||| |||| || |||||||
Db    445 CCGTGTCGGTGTCTGTGTCTGTCCTCACACTGAGCTGTATCGCCTTGGATCGGTGGTATG 504

Qy    440 CCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCCTGG 499
          | ||||| |||||  |  |||| |||||||||||   ||||||||||  |  |||  |
Db    505 CAATCTGTCACCCTTTGATGTTTAAGAGCACAGCAAAGCGGGCCCGTAACAGCATTGTCA 564

Qy    500 GCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCA 559
           |||||||  ||| ||       || ||| | || |||||    |||||||| |||||||
Db    565 TCATCTGGATTGTCTCCTGCATTATAATGATTCCTCAGGCCATCGTCATGGAGTGCAGCA 624

Qy    560 GTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGG 619
            ||| | || |   |||||||    ||    |||||  | || |||||||| |||||||
Db    625 CCGTGTTCCCAGGCTTAGCCAATAAAACCACCCTCTTTACGGTGTGTGATGAGCGCTGGG 684

Qy    620 CAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCC 679
            | |||  | ||||||||||| |||||||  || |||||| | || || ||| |||| |
Db    685 GTGGTGAAATTTATCCCAAGATGTACCACATCTGTTTCTTTCTGGTGACATACATGGCAC 744

Qy    680 CACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGA 739
          ||||| | |||||||   |||| ||| | || ||||| ||||| |||||| | || ||||
Db    745 CACTGTGTCTCATGGTGTTGGCTTATCTGCAAATATTTCGCAAACTCTGGTGTCGACAGA 804

Qy    740 TCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGAC 789
          |||| || ||  | || | | |   | | || |||||   |||   || |
Db    805 TCCCTGGAACATCATCTGTAGTTCAGAGAAAATGGAAAGCCCCTGGAGCC 854



Search completed: October 15, 2004, 22:50:27 
Job time : 3725.85 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on:         October 15, 2004, 13:54:41 ; Search time 5178.06 Seconds
                (without alignments)
                10697.520 Million cell updates/sec

Title:          US-10-070-532-1
Perfect score:  1278
Sequence:       1 atggagccctcagccacccc tcaccacagtgctgccctga 1278

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       3470272 seqs, 21671516995 residues

Total number of hits satisfying chosen parameters:      6940544

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
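
The header identifies the scoring scheme only by label (sw model, IDENTITY_NUC, Gapop 10.0, Gapext 1.0). As a rough illustration, the Python sketch below scores a local alignment in Smith-Waterman style with affine gaps. It is not GenCore's implementation. The match score of +1 and mismatch score of -0.6 are assumptions inferred from tallies reported in this listing (1276 matches and 2 mismatches give 1274.8; 512 matches and 198 mismatches give 393.2). Charging `gap_open` for the first gapped position and `gap_extend` for each further one is also an assumed convention, and the name `sw_score` is hypothetical.

```python
def sw_score(query, target, match=1.0, mismatch=-0.6,
             gap_open=10.0, gap_extend=1.0):
    """Best local alignment score (Smith-Waterman with affine gaps).

    Assumed scoring: +match / mismatch per aligned pair; a gap of
    length k costs gap_open + (k - 1) * gap_extend.
    """
    n = len(target)
    neg = float("-inf")
    h_prev = [0.0] * (n + 1)   # best score of an alignment ending at (i-1, j)
    e_prev = [neg] * (n + 1)   # same, but ending in a gap that skips query[i-1]
    best = 0.0
    for qc in query:
        h_cur = [0.0] * (n + 1)
        e_cur = [neg] * (n + 1)
        f = neg                # best score ending in a gap that skips target[j-1]
        for j in range(1, n + 1):
            # Open a new gap or extend an existing one, in each direction.
            e_cur[j] = max(h_prev[j] - gap_open, e_prev[j] - gap_extend)
            f = max(h_cur[j - 1] - gap_open, f - gap_extend)
            pair = match if qc == target[j - 1] else mismatch
            # Local alignment: never let the running score drop below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + pair, e_cur[j], f)
            best = max(best, h_cur[j])
        h_prev, e_prev = h_cur, e_cur
    return best


# With no gaps, the score is simply matches * 1.0 + mismatches * (-0.6).
ungapped = 1276 * 1.0 + 2 * (-0.6)
```

Under these assumed values, an ungapped hit with 1276 matches and 2 mismatches scores 1274.8, which agrees with the top hits reported for this query.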



Database :      GenEmbl:*
                1:  gb_ba:*
                2:  gb_htg:*
                3:  gb_in:*
                4:  gb_om:*
                5:  gb_ov:*
                6:  gb_pat:*
                7:  gb_ph:*
                8:  gb_pl:*
                9:  gb_pr:*
                10: gb_ro:*
                11: gb_sts:*
                12: gb_sy:*
                13: gb_un:*
                14: gb_vi:*
                15: em_ba:*
                16: em_fun:*
                17: em_hum:*
                18: em_in:*
                19: em_mu:*
                20: em_om:*
                21: em_or:*
                22: em_ov:*
                23: em_pat:*
                24: em_ph:*
                25: em_pl:*
                26: em_ro:*
                27: em_sts:*
                28: em_un:*
                29: em_vi:*
                30: em_htg_hum:*
                31: em_htg_inv:*
                32: em_htg_other:*
                33: em_htg_mus:*
                34: em_htg_pln:*
                35: em_htg_rod:*
                36: em_htg_mam:*
                37: em_htg_vrt:*
                38: em_sy:*
                39: em_htgo_hum:*
                40: em_htgo_mus:*
                41: em_htgo_other:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
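
The percentage columns reported for each hit appear to follow directly from the other figures in this listing: "Query Match" matches the score divided by the perfect score (1278), and "Best Local Similarity" matches the number of identical positions divided by the number of aligned positions. The Python sketch below reproduces that arithmetic. The function names are hypothetical, and the formulas are inferred from the reported values rather than taken from GenCore documentation.

```python
PERFECT_SCORE = 1278  # "Perfect score" from the run header


def query_match_pct(score, perfect=PERFECT_SCORE):
    """Score as a percentage of the perfect score, to one decimal place."""
    return round(100.0 * score / perfect, 1)


def local_similarity_pct(matches, mismatches, indels=0):
    """Identical positions as a percentage of all aligned positions."""
    aligned = matches + mismatches + indels
    return round(100.0 * matches / aligned, 1)
```

For example, `query_match_pct(1274.8)` and `local_similarity_pct(1276, 2)` give the 99.7% and 99.8% reported for the top hits, and `query_match_pct(393.2)` and `local_similarity_pct(512, 198)` give the 30.8% and 72.1% reported for BM926746.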

SUMMARIES

                   %
Result           Query
   No.    Score  Match  Length  DB  ID            Description

     1   1274.8   99.7    1564   6  E43974        E43974 Novel G pro
     2   1274.8   99.7    1564   6  E50810        E50810 Novel G pro
     3   1274.8   99.7    1564   6  E50811        E50811 Novel G pro
     4   1274.8   99.7    1564   6  AX299473      AX299473 Sequence
     5   1274.8   99.7    1564   6  AX299475      AX299475 Sequence
     6   1274.8   99.7    1564  [illegible]  [illegible]  [illegible] Sequence
     7   1274.8   99.7    1564   6  AX746121      AX746121 Sequence
     8   1274.8   99.7    1564   6  AX840912      AX840912 Sequence
     9   1274.8   99.7    1564   9  AF041243      AF041243 Homo sapi
    10     1270   99.4    1278   6  AX280925      AX280925 Sequence
    11   1205.8   94.4    1209   6  AR216117      AR216117 Sequence
    12     1201   94.0    1209   6  BD185452      BD185452 Human neu
    13   1086.4   85.0    1133   6  E43973        E43973 Novel G pro
    14   1086.4   85.0    1133   6  AX746120      AX746120 Sequence
    15   1086.4   85.0    1170   6  E43972        E43972 Novel G pro
    16   1086.4   85.0    1170   6  AX746118      AX746118 Sequence
    17   1085.8   85.0    1110   6  AR216118      AR216118 Sequence
    18   1083.2   84.8    1116   6  AR216119      AR216119 Sequence
    19   1083.2   84.8    1133   6  BD185454      BD185454 Human neu
    20   1077.8   84.3    1110   6  BD185453      BD185453 Human neu
    21      998   78.1    2200  10  AY336083      AY336083 Mus muscu
    22    991.6   77.6    2469  10  AF041244      AF041244 Rattus no
    23    699.2   54.7     843   6  AR109899      AR109899 Sequence
    24    672.2   52.6     789   6  AR109632      AR109632 Sequence
    25    672.2   52.6     789   6  E12154        E12154 cDNA encodi
    26    672.2   52.6     789   6  AR300942      AR300942 Sequence
    27    640.2   50.1     781  10  AF394596      AF394596 Mus muscu
    28    601.2   47.0    3114  10  AF041246      AF041246 Rattus no
    29    578.6   45.3    1545  10  AY336084      AY336084 Mus muscu
    30    578.6   45.3    2117  10  AY336085      AY336085 Mus muscu
    31    554.4   43.4    1633   6  E33974        E33974 cDNA clone
    32    554.4   43.4    1843   6  AX549084      AX549084 Sequence
    33    554.4   43.4    1843   6  AX840914      AX840914 Sequence
    34    554.4   43.4  [illegible]  [illegible]  [illegible]  [illegible]
    35    549.6   43.0  [illegible]  [illegible]  [illegible]  [illegible]
    36    541.6   42.4  [illegible]   4  [illegible]  [illegible]
    37    497.8   39.0  [illegible]  10  [illegible]  [illegible] Mus muscu
    38    330.8   25.9  [illegible]   4  [illegible]  [illegible] Ovis arie
    39  [illegible]  [illegible]  [illegible]  10  [illegible]  [illegible] Mus muscu
    40    281.6   22.0     328   4  AB092488      AB092488 Bos tauru
    41    263.2   20.6     501   4  AF532967      AF532967 Ovis arie
    42    249.2   19.5     344   9  F202078S03    AF202080 Homo sapi
    43    249.2   19.5    9785   6  AR178605      AR178605 Sequence
    44    249.2   19.5    9785   6  AX088174      AX088174 Sequence
    45    249.2   19.5    9785   9  AY062030      AY062030 Homo sapi



ALIGNMENTS 



RESULT 1
E43974
LOCUS       E43974                  1564 bp    DNA     linear   PAT 31-JAN-2002
DEFINITION  Novel G protein-coupled receptor (HFGAN72Y).
ACCESSION   E43974
VERSION     E43974.1  GI:18625173
KEYWORDS    JP 2000106888-A/3.
SOURCE      unidentified
  ORGANISM  unidentified
            unclassified.
REFERENCE   1  (bases 1 to 1564)
  AUTHORS   Bergsma,D.J. and Ellis,C.E.
  TITLE     Novel G protein-coupled receptor (HFGAN72Y)
  JOURNAL   Patent: JP 2000106888-A 3 18-APR-2000;
            SMITHKLINE BEECHAM CORP
COMMENT     OS   Unidentified
            PN   JP 2000106888-A/3
            PD   18-APR-2000
            PF   21-JUL-1999 JP 1999206116
            PR   30-APR-1997 US 08/846705
            PI   DERK J BERGSMA, CATHARINE ELIZABETH ELLIS
            PC   C12N15/09,A61K38/00,A61K38/00,A61K45/00,A61K48/00,A61P1/00,
            PC   A61P1/14,
            PC   A61P9/02,A61P9/04,A61P9/10,A61P9/12,A61P11/06,A61P13/02,
            PC   A61P13/08,
            PC   A61P19/10,A61P25/14,A61P25/16,A61P25/18,A61P25/22,A61P25/24,
            PC   A61P31/04,
            PC   A61P31/10,A61P31/12,A61P31/18,A61P33/00,A61P35/00,A61P37/08,
            PC   A61P43/00,
            PC   C07K14/705,C07K16/28,C12N1/21,C12N5/10,C12P21/02,G01N33/566,
            PC   G01N33/577//
            PC   C12P21/08,(C12N15/09 / C12R1:91),(C12P21/02,C12R1:91),C12N15/00,
            PC   A61K37/02,
            PC   A61K37/02,C12N5/00,(C12N15/00,C12R1:91)
            CC   Strandedness: Single;
            CC   Topology: Linear;
            FH   Key             Location/Qualifiers
            FT   source          1..1564
            FT                   /organism='Unidentified'.
FEATURES             Location/Qualifiers
     source          1..1564
                     /organism="unidentified"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:32644"
ORIGIN

  Query Match             99.7%;  Score 1274.8;  DB 6;  Length 1564;
  Best Local Similarity   99.8%;  Pred. No. 9.4e-244;
  Matches 1276;  Conservative    0;  Mismatches    2;  Indels    0;  Gaps    0;



Qy      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy    241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy    301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513

Qy    361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573

Qy    421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633

Qy    481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693

Qy    541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753

Qy    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173

Qy   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy   1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy   1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353

Qy   1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
          |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1354 TCCTTGTAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy   1261 ACCACAGTGCTGCCCTGA 1278
          ||||||||||||||||||
Db   1414 ACCACAGTGCTGCCCTGA 1431
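
The Matches and Mismatches tallies reported for a hit can be cross-checked against any Qy/Db pair of an ungapped alignment such as the one above. The Python sketch below rebuilds the marker row between the two sequences and lists the query coordinates that disagree. The helper names `match_line` and `mismatch_positions` are hypothetical, the `|` marker is an assumed rendering, and the example strings are the first twelve bases of the Qy 1201 / Db 1354 block.

```python
def match_line(qy, db):
    """Marker row for an ungapped pair: '|' where bases agree, ' ' otherwise."""
    return "".join("|" if a == b else " " for a, b in zip(qy, db))


def mismatch_positions(qy, db, start=1):
    """Query coordinates (counting from `start`) where the two rows differ."""
    return [start + i for i, (a, b) in enumerate(zip(qy, db)) if a != b]


qy = "TCCTTGCAGAGC"   # query bases 1201-1212
db = "TCCTTGTAGAGC"   # database bases 1354-1365
markers = match_line(qy, db)
positions = mismatch_positions(qy, db, start=1201)
```

Here `positions` contains only query position 1207, which is one of the two mismatches reported for this hit.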



RESULT 2
E50810
LOCUS       E50810                  1564 bp    DNA     linear   PAT 18-JUN-2001
DEFINITION  Novel G protein-bound receptor (HFGAN 72X).
ACCESSION   E50810
VERSION     E50810.1  GI:13023197
KEYWORDS    JP 2000060578-A/1.
SOURCE      unidentified
  ORGANISM  unidentified
            unclassified.
REFERENCE   1  (bases 1 to 1564)
  AUTHORS   Derk,J.B. and Catharine,E.E.
  TITLE     Novel G protein-bound receptor (HFGAN 72X)
  JOURNAL   Patent: JP 2000060578-A 1 29-FEB-2000;
            SMITHKLINE BEECHAM CORP
COMMENT     OS   Unidentified
            PN   JP 2000060578-A/1
            PD   29-FEB-2000
            PF   21-JUL-1999 JP 1999206115
            PR   30-APR-1997 US 08/846704
            PI   DERK J BERGSMA, CATHARINE ELIZABETH ELLIS
            PC   C12N15/09,A61K31/70,A61K38/00,A61K39/00,A61K39/395,A61K39/395,
            PC   A61K45/00,
            PC   A61K48/00,A61P3/04,A61P9/00,A61P11/06,A61P13/00,A61P25/00,
            PC   A61P25/16,
            PC   A61P25/18,A61P25/20,A61P25/22,A61P31/04,A61P31/10,A61P31/12,
            PC   A61P31/18,
            PC   A61P35/00,A61P37/00,C07K14/705,C12N5/10,C12P21/02,C12Q1/02,
            PC   G01N33/53,
            PC   G01N33/566//C07K16/28,C12N15/00,A61K37/02,C12N5/00
            CC   Strandedness: Single;
            CC   Topology: Linear;
            FH   Key             Location/Qualifiers
            FT   source          1..1564
            FT                   /organism='Unidentified'.
FEATURES             Location/Qualifiers
     source          1..1564
                     /organism="unidentified"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:32644"
ORIGIN

  Query Match             99.7%;  Score 1274.8;  DB 6;  Length 1564;
  Best Local Similarity   99.8%;  Pred. No. 9.4e-244;
  Matches 1276;  Conservative    0;  Mismatches    2;  Indels    0;  Gaps    0;



Qy      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy    241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy    301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513

Qy    361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573

Qy    421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633

Qy    481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693

Qy    541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753

Qy    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173

Qy   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy   1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy   1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353

Qy   1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
          ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Db   1354 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy   1261 ACCACAGTGCTGCCCTGA 1278
          ||||||||||||||||||
Db   1414 ACCACAGTGCTGCCCTGA 1431



RESULT 3
E50811
LOCUS       E50811                  1564 bp    DNA     linear   PAT 18-JUN-2001
DEFINITION  Novel G protein-bound receptor (HFGAN 72X).
ACCESSION   E50811
VERSION     E50811.1  GI:13023198
KEYWORDS    JP 2000060578-A/2.
SOURCE      unidentified
  ORGANISM  unidentified
            unclassified.
REFERENCE   1  (bases 1 to 1564)
  AUTHORS   Derk,J.B. and Catharine,E.E.
  TITLE     Novel G protein-bound receptor (HFGAN 72X)
  JOURNAL   Patent: JP 2000060578-A 2 29-FEB-2000;
            SMITHKLINE BEECHAM CORP
COMMENT     OS   Unidentified
            PN   JP 2000060578-A/2
            PD   29-FEB-2000
            PF   21-JUL-1999 JP 1999206115
            PR   30-APR-1997 US 08/846704
            PI   DERK J BERGSMA, CATHARINE ELIZABETH ELLIS
            PC   C12N15/09,A61K31/70,A61K38/00,A61K39/00,A61K39/395,A61K39/395,
            PC   A61K45/00,
            PC   A61K48/00,A61P3/04,A61P9/00,A61P11/06,A61P13/00,A61P25/00,
            PC   A61P25/16,
            PC   A61P25/18,A61P25/20,A61P25/22,A61P31/04,A61P31/10,A61P31/12,
            PC   A61P31/18,
            PC   A61P35/00,A61P37/00,C07K14/705,C12N5/10,C12P21/02,C12Q1/02,
            PC   G01N33/53,
            PC   G01N33/566//C07K16/28,C12N15/00,A61K37/02,C12N5/00
            CC   Strandedness: Single;
            CC   Topology: Linear;
            FH   Key             Location/Qualifiers
            FT   source          1..1564
            FT                   /organism='Unidentified'.
FEATURES             Location/Qualifiers
     source          1..1564
                     /organism="unidentified"
                     /mol_type="genomic DNA"
                     /db_xref="taxon:32644"
ORIGIN

  Query Match             99.7%;  Score 1274.8;  DB 6;  Length 1564;
  Best Local Similarity   99.8%;  Pred. No. 9.4e-244;
  Matches 1276;  Conservative    0;  Mismatches    2;  Indels    0;  Gaps    0;



Qy      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy    241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy    301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513

Qy    361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573

Qy    421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633

Qy    481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693

Qy    541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753

Qy    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173

Qy   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy   1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy   1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353

Qy   1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
          |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1354 TCCTTGTAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy   1261 ACCACAGTGCTGCCCTGA 1278
          ||||||||||||||||||
Db   1414 ACCACAGTGCTGCCCTGA 1431
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AX299473 1564 bp DNA linear PAT 26-NOV-2001 

Sequence 1 from Patent EP1154019. 

AX299473 

AX299473.1 GI: 17129230 

Homo sapiens (human) 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

1 

Bergsma, D. J. and Ellis, C.E. 
G-protein coupled receptor (hfgan72x) 
Patent: EP 1154019-A 1 14-NOV-2001; 
SmithKline Beecham Corporation (US) 

Location/Qualifiers 

1. .1564 

/organism="Homo sapiens"
/mol_type="unassigned DNA"
/db_xref="taxon:9606"



  Query Match             99.7%;  Score 1274.8;  DB 6;  Length 1564;
  Best Local Similarity   99.8%;  Pred. No. 9.4e-244;
  Matches 1276;  Conservative    0;  Mismatches    2;  Indels    0;  Gaps    0;



Qy 

Db 



1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 
I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213 



61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273 

121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | | | | 
274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333 

181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453 

301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513 

361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I.I I I I I I I I I I I I I I I I I I I 
514 GTCATCGCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573 

421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633 

481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I 1.1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | 
634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693 

541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753 

601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813 

661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873 

721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933 

781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993 

841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 

994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053



Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113 



Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 

Db 1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353 

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 

Db 1354 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413



Qy 1261 ACCACAGTGCTGCCCTGA 1278 

I I I I I I I I I I I I I I I I I I 
Db 1414 ACCACAGTGCTGCCCTGA 1431 
AX299475 1564 bp DNA linear PAT 26-NOV-2001

Sequence 3 from Patent EP1154019.

AX299475

AX299475.1 GI:17129231



Homo sapiens (human) 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

1 

Bergsma,D.J. and Ellis, C.E. 
G-protein coupled receptor (hfgan72x) 
Patent: EP 1154019-A 3 14-NOV-2001; 
SmithKline Beecham Corporation (US) 
Location/Qualifiers
1. .1564 

/organism="Homo sapiens" 
/mol_type="unassigned DNA"
/db_xref="taxon:9606"



Query Match 99.7%; Score 1274.8; DB 6; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 9.4e-244; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 



Qy 



1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273 

121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333 

181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I 

334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393 

241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I \\ I I I I I I I I I I I I I I I I I I I I 
394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453 

301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513 

361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I II I 1 1 I I I I I I 1 1 I I i 1 1 1 I I 1 1 1 I 1 1 I I I 1 1 1 1 1 1 I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 I I I 1 1 1 1 

514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573 

421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 

574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633 

481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693 

541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753

601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813 

661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I 

814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933 

781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I 
934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993 

841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I II I I I I I I I I I 
Db 1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
Db 1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140

I I I I I I I I I |J II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I .. 
Db 1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I II I I I I II I II I I I I I I I I II I I I I 

Db 1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1354 TCCTTGTAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413 

Qy 1261 ACCACAGTGCTGCCCTGA 1278 

I I I I I I I I I I I I I I I I I I 
Db 1414 ACCACAGTGCTGCCCTGA 1431 
AX549082 1564 bp DNA linear PAT 26-NOV-2002

Sequence 367 from Patent WO02061087. 

AX549082 

AX549082.1 GI:25813851 

Homo sapiens (human) 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi ; 

Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

1 

Burmer,G.C., Roush,C.L. and Brown,J.P.

Antigenic peptides, such as for G protein-coupled receptors 
(GPCRs), antibodies thereto; and systems for identifying such
antigenic peptides 

Patent: WO 02061087-A 367 08-AUG-2002; 
Lifespan Biosciences, Inc. (US) 

Location/Qualifiers 

1. .1564 

/organism="Homo sapiens" 
/mol_type="unassigned DNA" 
/db_xref="taxon:9606"



Query Match 99.7%; Score 1274.8; DB 6; Length 1564;

Best Local Similarity 99.8%; Pred. No. 9.4e-244;

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I 

Db 154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753 

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933 

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840

I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993



Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I II I I 
Db 994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

1.1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 r 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I 1 1 I I I I 1 1 I I I I 1 1 I I I I I i I I I I I I I I I I I I I I II I I I I I I 

Db 1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353 

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml I I I 
Db 1354 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413 

Qy 1261 ACCACAGTGCTGCCCTGA 1278 

I I I I I I I I I I I I I I I I I I 
Db 1414 ACCACAGTGCTGCCCTGA 1431 



RESULT 7
AX746121
LOCUS AX746121 1564 bp DNA linear PAT 12-JUN-2003
DEFINITION Sequence 4 from Patent EP1156110.
ACCESSION AX746121
VERSION AX746121.1 GI:31744927
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
  Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
  Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Bergsma,D.J. and Ellis,C.E.
TITLE G-protein coupled receptor (HFGAN72Y)
JOURNAL Patent: EP 1156110-A 4 21-NOV-2001;
  SMITHKLINE BEECHAM CORPORATION (US)
FEATURES Location/Qualifiers
source 1..1564
  /organism="Homo sapiens"
  /mol_type="genomic DNA"
  /db_xref="taxon:9606"
  /note="HGS EST 554692"
ORIGIN



Query Match 99.7%; Score 1274.8; DB 6; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 9.4e-244; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273 

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I II I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753 


Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813 

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I | I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933 



Qy  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db  934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
        |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1354 TCCTTGTAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy 1261 ACCACAGTGCTGCCCTGA 1278
        ||||||||||||||||||
Db 1414 ACCACAGTGCTGCCCTGA 1431



RESULT 8
AX840912
LOCUS AX840912 1564 bp DNA linear PAT 16-DEC-2003
DEFINITION Sequence 8 from Patent WO03075945.
ACCESSION AX840912
VERSION AX840912.1 GI:39979051
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
  Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
  Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Eulenberg,K., Steuernagel,A., Haeder,T. and Broenner,G.
TITLE Cg8327, cg10823, cg18418, cg15862, cg3768, cg11447 and cg16750
  homologous proteins involved in the regulation of energy
  homeostasis
JOURNAL Patent: WO 03075945-A 8 18-SEP-2003;
  DeveloGen Aktiengesellschaft fuer entwicklungsbiologische
  Forschung (DE)
FEATURES Location/Qualifiers
source 1..1564
  /organism="Homo sapiens"
  /mol_type="unassigned DNA"
  /db_xref="taxon:9606"
ORIGIN

Query Match 99.7%; Score 1274.8; DB 6; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 9.4e-244; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 
Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 I I I 1 1 1 I I I I I I 1 1 I 1 1 I i I I I 1 1 I I I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 

Db 154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I IN I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I II I I 

Db 394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I 
Db 454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 

Db 514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I.I 

Db 634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753 

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I 
Db 754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813



Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||



Db 814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I fl I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933 

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I 
Db 1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II 
Db 1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II 
Db 1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I 

Db 1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353 

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260 

I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I II I I I I I I 

Db 1354 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy 1261 ACCACAGTGCTGCCCTGA 1278 

I I I I I I I I I I I I I I I I I I 
Db 1414 ACCACAGTGCTGCCCTGA 1431 



RESULT 9
AF041243
LOCUS AF041243 1564 bp mRNA linear PRI 24-FEB-1998
DEFINITION Homo sapiens orexin receptor-1 mRNA, complete cds.
ACCESSION AF041243
VERSION AF041243.1 GI:2897123
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1 (bases 1 to 1564)
AUTHORS Sakurai,T., Amemiya,A., Ishii,M., Matsuzaki,I., Chemelli,R.M.,
Tanaka,H., Williams,S.C., Richardson,J.A., Kozlowski,G.P.,
Wilson,S., Arch,J.R.S., Buckingham,R.E., Haynes,A.C., A. Carr,S.A.,
Annan,R.S., McNulty,D.E., Liu,W.-S., Terrett,J.A.,
Elshourbagy,N.A., Bergsma,D.J. and Yanagisawa,M.
TITLE Orexins and orexin receptors: a family of hypothalamic
neuropeptides and G protein-coupled receptors that regulate feeding
behavior
JOURNAL Cell 92 (4), 573-585 (1998)
MEDLINE 98150861
PUBMED 9491897
REFERENCE 2 (bases 1 to 1564)
AUTHORS Sakurai,T., Amemiya,A., Ishii,M., Matsuzaki,I., Chemelli,R.M.,
Tanaka,H., Williams,S.C., Richardson,J.A., Kozlowski,G.P.,
Wilson,S., Arch,J.R.S., Buckingham,R.E., Haynes,A.C., A. Carr,S.A.,
Annan,R.S., McNulty,D.E., Liu,W.-S., Terrett,J.A.,
Elshourbagy,N.A., Bergsma,D.J. and Yanagisawa,M.
TITLE Direct Submission
JOURNAL Submitted (07-JAN-1998) HHMI/Department of Molecular Genetics,
University of Texas Southwestern Medical Center at Dallas, 5323
Harry Hines Blvd., Rm. Y5.224, Dallas, TX 75235-9050, USA
FEATURES Location/Qualifiers
source 1..1564
/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
/chromosome="1"
/map="1p33"
CDS 154..1431
/note="OX1R; G protein-coupled receptor"
/codon_start=1
/product="orexin receptor-1"
/protein_id="AAC39601.1"
/db_xref="GI:2897124"
/translation="MEPSATPGAQMGVPPGSREPSPVPPDYEDEFLRYLWRDYLYPKQ
YEWVLIAAYVAVFVVALVGNTLVCLAVWRNHHMRTVTNYFIVNLSLADVLVTAICLPA
SLLVDITESWLFGHALCKVIPYLQAVSVSVAVLTLSFIALDRWYAICHPLLFKSTARR
ARGSILGIWAVSLAIMVPQAAVMECSSVLPELANRTRLFSVCDERWADDLYPKIYHSC
FFIVTYLAPLGLMAMAYFQIFRKLWGRQIPGTTSALVRNWKRPSDQLGDLEQGLSGEP
QPRGRAFLAEVKQMRARRKTAKMLMVVLLVFALCYLPISVLNVLKRVFGMFRQASDRE
AVYACFTFSHWLVYANSAANPIIYNFLSGKFREQFKAAFSCCLPGLGPCGSLKAPSPR
SSASHKSLSLQSRCSISKISEHVVLTSVTTVLP"
ORIGIN



Query Match 99.7%; Score 1274.8; DB 9; Length 1564;

Best Local Similarity 99.8%; Pred. No. 9.4e-244;

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy          1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy         61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy        121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy        181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy        241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy        301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513

Qy        361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573

Qy        421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633

Qy        481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693

Qy        541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753

Qy        601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy        661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy        721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy        781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db        934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy        841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy        901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy        961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173

Qy       1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy       1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy       1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353

Qy       1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
              ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Db       1354 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy       1261 ACCACAGTGCTGCCCTGA 1278
              ||||||||||||||||||
Db       1414 ACCACAGTGCTGCCCTGA 1431

RESULT 10
AX280925
LOCUS AX280925 1278 bp DNA linear PAT 02-NOV-2001
DEFINITION Sequence 548 from Patent WO0177172.
ACCESSION AX280925
VERSION AX280925.1 GI:16608218
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Lehmann-Bruinsma,K., Liaw,C.W. and Lin,I.L.
TITLE Non-endogenous, constitutively activated known g protein-coupled
receptors
JOURNAL Patent: WO 0177172-A 548 18-OCT-2001;
Arena Pharmaceuticals, Inc. (US)
FEATURES Location/Qualifiers
source 1..1278
/organism="Homo sapiens"
/mol_type="unassigned DNA"
/db_xref="taxon:9606"
ORIGIN



Query Match 99.4%; Score 1270; DB 6; Length 1278;

Best Local Similarity 99.6%; Pred. No. 8.7e-243;

Matches 1273; Conservative 0; Mismatches 5; Indels 0; Gaps 0;



Qy          1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

Qy         61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy        121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy        181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy        241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy        301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

Qy        361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

Qy        421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy        481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy        541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

Qy        601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy        661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy        721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy        781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db        781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840

Qy        841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
              ||||||||||||||||||||||||||||||||||||||||||||||||   |||||||||
Db        841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAAAAAAGATGCTG 900

Qy        901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960

Qy        961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

Qy       1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy       1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140

Qy       1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200

Qy       1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
              ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Db       1201 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260

Qy       1261 ACCACAGTGCTGCCCTGA 1278
              ||||||||||||||||||
Db       1261 ACCACAGTGCTGCCCTGA 1278

RESULT 11
AR216117
LOCUS AR216117 1209 bp DNA linear PAT 25-SEP-2002
DEFINITION Sequence 1 from patent US 6410701.
ACCESSION AR216117
VERSION AR216117.1 GI:23314430
KEYWORDS
SOURCE Unknown.
ORGANISM Unknown.
Unclassified.
REFERENCE 1 (bases 1 to 1209)
AUTHORS Soppet,D.R., Li,Y. and Rosen,C.A.
TITLE Human neuropeptide receptor
JOURNAL Patent: US 6410701-A 1 25-JUN-2002;
FEATURES Location/Qualifiers
source 1..1209
/organism="unknown"
/mol_type="genomic DNA"

ORIGIN 

Query Match 94.4%; Score 1205.8; DB 6; Length 1209; 

Best Local Similarity 99.8%; Pred. No. 5.4e-230; 

Matches 1207; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

| | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

| | | | | | | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I . 
Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180



Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I 

241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 

421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I II I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840

841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I I I I I I I I I I I I I I I I I I J I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020



Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I! I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 



Qy 1201 TCCTTGCAG 1209 

I I I I I I II 
Db 1201 TCCTTGTAG 1209
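
The summary figures printed above each alignment (Score, Query Match, Best Local Similarity) can be cross-checked from the Matches and Mismatches counts. The sketch below is not taken from the search program itself: the per-mismatch penalty of 0.6 is an assumption inferred from the listed scores (for example 1276 matches and 2 mismatches give 1274.8), the perfect score of 1278 is the query length from the report header, and gap terms are ignored because every result in this section reports Indels 0. The function name `summarize` is hypothetical.

```python
PERFECT_SCORE = 1278    # query length, as stated in the report header
MISMATCH_PENALTY = 0.6  # assumption: inferred from the listed scores, not stated in the report


def summarize(matches, mismatches):
    """Recompute Score, Query Match % and Best Local Similarity % for a gap-free hit."""
    score = round(matches - MISMATCH_PENALTY * mismatches, 1)
    query_match = round(100 * score / PERFECT_SCORE, 1)
    similarity = round(100 * matches / (matches + mismatches), 1)
    return score, query_match, similarity


# Counts copied from the summary lines of two results in this section.
for accession, matches, mismatches in [("AF041243", 1276, 2), ("AR216117", 1207, 2)]:
    print(accession, summarize(matches, mismatches))
```

If the recomputed values disagree with a printed summary line, the more likely cause is a transcription error in the listing than a different scoring scheme, since the same relation holds for every hit shown here.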



RESULT 12
BD185452
LOCUS BD185452 1209 bp DNA linear PAT 17-JUN-2003
DEFINITION Human neuropeptide receptor.
ACCESSION BD185452
VERSION BD185452.1 GI:31877652
KEYWORDS JP 2002360288-A/1.
SOURCE unidentified
ORGANISM unidentified
unclassified.
REFERENCE 1 (bases 1 to 1209)
AUTHORS Soppet,D.R., Li,Y. and Rosen,C.A.
TITLE Human neuropeptide receptor
JOURNAL Patent: JP 2002360288-A 1 17-DEC-2002;
HUMAN GENOME SCIENCES INC
COMMENT OS Unidentified
PN JP 2002360288-A/1
PD 17-DEC-2002
PF 02-MAY-2002 JP 2002130838
PI DANIEL R SOPPET,YI LI,CRAIG A ROSEN
PC C12N15/09,A61K31/7088,A61K38/00,A61K45/00,A61K48/00,A61P3/04,
PC A61P3/06,
PC A61P3/10,A61P9/10,A61P9/12,A61P25/08,A61P25/18,A61P25/22,
PC A61P25/28,
PC A61P35/00,A61P43/00,C07K14/705,C07K16/24,C12N1/15,C12N1/19,
PC C12N1/21,
PC C12N5/10,C12Q1/68,C12N15/00,C12N5/00,A61K37/02
CC Strandedness: Single;
CC Topology: Linear;
CC Human neuropeptide receptor
FH Key Location/Qualifiers
FT source 1..1209
FT /organism='Unidentified'.
FEATURES Location/Qualifiers
source 1..1209
/organism="unidentified"
/mol_type="genomic DNA"
/db_xref="taxon:32644"



Query Match 94.0%; Score 1201; DB 6; Length 1209; 

Best Local Similarity 99.6%; Pred. No. 4.8e-229; 

Matches 1204; Conservative 0; Mismatches 5; Indels 0; Gaps 0;



Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I 

Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I . . 
Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCCCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 r 1 1 1 1 

Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

I I I I I I I I I I I I II I I I I I I I I I I I J I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 601 GTCTGTCATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I II I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 721 AACCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 



Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II 
Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200

Qy 1201 TCCTTGCAG 1209 

I I I I I I II 
Db 1201 TCCTTGTAG 1209 
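
Because the match rows in this listing are hard to read, the positions of the reported mismatches can be recovered directly from a Qy/Db pair. The following is a minimal sketch, assuming gap-free rows of equal length as in the alignments shown here; the helper name `find_mismatches` is hypothetical and is not part of the search program's output.

```python
def find_mismatches(qy, db, qy_start=1):
    """List (query position, query base, database base) for each differing column."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have equal length")
    return [
        (qy_start + offset, q, d)
        for offset, (q, d) in enumerate(zip(qy, db))
        if q != d
    ]


# Final block of the alignment above: query positions 1201 to 1209.
print(find_mismatches("TCCTTGCAG", "TCCTTGTAG", qy_start=1201))
# prints [(1207, 'C', 'T')]
```

Running the helper over every block of a result and summing the list lengths should reproduce the Mismatches count on that result's summary line.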



RESULT 13
E43973
LOCUS E43973 1133 bp DNA linear PAT 31-JAN-2002
DEFINITION Novel G protein-coupled receptor (HFGAN72Y).
ACCESSION E43973
VERSION E43973.1 GI:18625172
KEYWORDS JP 2000106888-A/2.
SOURCE unidentified
ORGANISM unidentified
unclassified.
REFERENCE 1 (bases 1 to 1133)
AUTHORS Bergsma,D.J. and Ellis,C.E.
TITLE Novel G protein-coupled receptor (HFGAN72Y)
JOURNAL Patent: JP 2000106888-A 2 18-APR-2000;
SMITHKLINE BEECHAM CORP
COMMENT OS Unidentified
PN JP 2000106888-A/2
PD 18-APR-2000
PF 21-JUL-1999 JP 1999206116
PR 30-APR-1997 US 08/846705
PI DERK J BERGSMA,CATHARINE ELIZABETH ELLIS
PC C12N15/09,A61K38/00,A61K38/00,A61K45/00,A61K48/00,A61P1/00,
PC A61P1/14,
PC A61P9/02,A61P9/04,A61P9/10,A61P9/12,A61P11/06,A61P13/02,
PC A61P13/08,
PC A61P19/10,A61P25/14,A61P25/16,A61P25/18,A61P25/22,A61P25/24,
PC A61P31/04,
PC A61P31/10,A61P31/12,A61P31/18,A61P33/00,A61P35/00,A61P37/08,
PC A61P43/00,
PC C07K14/705,C07K16/28,C12N1/21,C12N5/10,C12P21/02,G01N33/566,
PC G01N33/577//
PC C12P21/08,(C12N15/09,C12R1:91),(C12P21/02,C12R1:91),C12N15/00,
PC A61K37/02,
PC A61K37/02,C12N5/00,(C12N15/00,C12R1:91)
CC Strandedness: Single;
CC Topology: Linear;
FH Key Location/Qualifiers
FT source 1..1133
FT /organism='Unidentified'.
FEATURES Location/Qualifiers
source 1..1133
/organism="unidentified"
/mol_type="genomic DNA"
/db_xref="taxon:32644"
ORIGIN



Query Match 85.0%; Score 1086.4; DB 6; Length 1133; 

Best Local Similarity 99.9%; Pred. No. 3.3e-206; 

Matches 1087; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 AGCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCxGCCTG 300 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I i 1 1 1 1 I 1 1 1 1 1 1 1 I 1 1 1 1 1 I 1 1 1 1 1 I 1 1 I 1 1 I 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 I I I 1 1 1 1 1 I I 

Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 



421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 



Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 
Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 



4 81 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 54 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I 

481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 54 0 



541 



541 



601 



601 



661 



661 



721 



721 



781 



781 



841 



841 



901 



901 



961 



961 



GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

G CAGT CAT GGAATGCAGC AGT GT GCT GC CT GAGCT AGC CAAC C GC ACAC G GCT CTT CT C A 



600 



600 



660 



GT CT GT GAT GAACGCT GG GCAGAT GACCT CTAT C C CAAGAT CT AC C AC AGTT GCTT CTT T 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I.I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 72 0 



AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 



CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 



840 



840 



900 



C GC G C CTT C CT G GCT GAAGT GAAGCAGAT GC GT GCACGGAGGAAGACAG C CAAGAT GCT G 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
CGCGCCTTCCTGGCT GAAGT GAAGCAGAT GCGTGCACGGAGGAAGACAGCCAAGAT GCT G 900 

ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 



1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

1081 CTCAGTGG 1088 

I I I II I I I 

1081 CTCAGTGG 1088 



RESULT 14
AX746120
LOCUS AX746120 1133 bp DNA linear PAT 12-JUN-2003
DEFINITION Sequence 3 from Patent EP1156110.
ACCESSION AX746120
VERSION AX746120.1 GI:31744926
KEYWORDS
SOURCE Homo sapiens (human)
ORGANISM Homo sapiens
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE 1
AUTHORS Bergsma,D.J. and Ellis,C.E.
TITLE G-protein coupled receptor (HFGAN72Y)
JOURNAL Patent: EP 1156110-A 3 21-NOV-2001;
SMITHKLINE BEECHAM CORPORATION (US)
FEATURES Location/Qualifiers
source 1..1133
/organism="Homo sapiens"
/mol_type="genomic DNA"
/db_xref="taxon:9606"
/note="HGS EST 557082"
ORIGIN



Query Match 85.0%; Score 1086.4; DB 6; Length 1133; 

Best Local Similarity 99.9%; Pred. No. 3.3e-206; 

Matches 1087; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

Qy   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

Qy  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

Qy  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

Qy  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840

Qy  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960

Qy  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy 1081 CTCAGTGG 1088
        ||||||||
Db 1081 CTCAGTGG 1088



RESULT 15
E43972
LOCUS E43972 1170 bp DNA linear PAT 31-JAN-2002
DEFINITION Novel G protein-coupled receptor (HFGAN72Y).
ACCESSION E43972
VERSION E43972.1 GI:18625171
KEYWORDS JP 2000106888-A/1.
SOURCE unidentified
ORGANISM unidentified
unclassified.
REFERENCE 1 (bases 1 to 1170)
AUTHORS Bergsma,D.J. and Ellis,C.E.
TITLE Novel G protein-coupled receptor (HFGAN72Y)
JOURNAL Patent: JP 2000106888-A 1 18-APR-2000;
SMITHKLINE BEECHAM CORP
COMMENT OS Unidentified
PN JP 2000106888-A/1
PD 18-APR-2000
PF 21-JUL-1999 JP 1999206116
PR 30-APR-1997 US 08/846705
PI DERK J BERGSMA, CATHARINE ELIZABETH ELLIS
PC C12N15/09,A61K38/00,A61K38/00,A61K45/00,A61K48/00,A61P1/00,
PC A61P1/14,
PC A61P9/02,A61P9/04,A61P9/10,A61P9/12,A61P11/06,A61P13/02,
PC A61P13/08,
PC A61P19/10,A61P25/14,A61P25/16,A61P25/18,A61P25/22,A61P25/24,
PC A61P31/04,
PC A61P31/10,A61P31/12,A61P31/18,A61P33/00,A61P35/00,A61P37/08,
PC A61P43/00,
PC C07K14/705,C07K16/28,C12N1/21,C12N5/10,C12P21/02,G01N33/566,
PC G01N33/577//
PC C12P21/08,(C12N15/09,C12R1:91),(C12P21/02,C12R1:91),C12N15/00,
PC A61K37/02,
PC A61K37/02,C12N5/00,(C12N15/00,C12R1:91)
CC Strandedness: Single;
CC Topology: Linear;
FH Key Location/Qualifiers
FT source 1..1170
FT /organism='Unidentified'.
FEATURES Location/Qualifiers
source 1..1170
/organism="unidentified"
/mol_type="genomic DNA"
/db_xref="taxon:32644"
ORIGIN



Query Match 85.0%; Score 1086.4; DB 6; Length 1170;

Best Local Similarity 99.9%; Pred. No. 3.3e-206;

Matches 1087; Conservative 0; Mismatches 1; Indels 0; Gaps 0;



Qy    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

Qy   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

Qy  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

Qy  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

Qy  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840

Qy  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960

Qy  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy 1081 CTCAGTGG 1088
        ||||||||
Db 1081 CTCAGTGG 1088
ALIGNMENTS



RESULT 1 
AAS00491 

ID AAS00491 standard; cDNA; 1278 BP.
XX 

AC AAS00491; 
XX 

DT 17-MAY-2001 (first entry) 
XX 

DE Human neuropeptide receptor cDNA. 
XX 

KW Human; neuropeptide receptor; neuropeptide Y receptor; obesity; 

KW nervous system disorder; hyperproliferative disorder; diabetes mellitus;

KW cardiovascular disorder; autoimmune disorder; infectious disorder; 

KW eating behaviour disorder; narcolepsy; neurological disease; 



KW narcotics addiction; nicotine addiction; alcohol addiction; gene therapy;
KW protein co-ordinate data; chromosome 1; ss.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT CDS 1..1278
FT /*tag= a
FT /product= "neuropeptide receptor"



XX 

PN WO200117532-A1. 
XX 

PD 15-MAR-2001. 
XX 

PF 07-SEP-2000; 2000WO-US024518.
XX 

PR 10-SEP-1999; 99US-00393696.
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Soppet DR, Li Y, Rosen CA; 
XX 

DR WPI; 2001-183276/18. 

DR P-PSDB; AAU00438. 
XX 

PT A new nucleic acid encoding a human neuropeptide receptor polypeptide, 

PT useful for preventing, treating or ameliorating obesity, narcolepsy, 

PT neurological disease and addiction to narcotics, nicotine and alcohol. 
XX 

PS Claim 4; Fig 1; 385pp; English. 
XX 

CC The present sequence encodes for a novel human neuropeptide receptor 

CC which shows sequence homology to the neuropeptide Y receptor. Two splice 

CC variants of the neuropeptide receptor (AAU00439-AAU00440) and a possible

CC mutant (AAU00442) are also described. Polypeptides and polynucleotides of 

CC the neuropeptide receptor are useful for diagnosing, preventing, or 

CC treating a pathological condition in a subject related to the central 

CC nervous and peripheral nervous systems (CNS and PNS). The polypeptides

CC and polynucleotides may be used to treat hyperproliferative,

CC cardiovascular, autoimmune, nervous system or infectious disorders e.g. 

CC cancer, heart disease, rheumatoid arthritis, Alzheimer's disease, HIV 

CC infection and diabetes mellitus. In particular they are useful for

CC preventing, treating or ameliorating a medical condition in a mammal such 

CC as obesity/eating behaviour disorders, narcolepsy, neurological disease, 

CC addiction to narcotics, nicotine and alcohol, chronic pain, acute pain, 

CC migraine headaches and anxiety disorders. The polynucleotides encoding

CC the neuropeptide receptor can also be used in gene therapy methods for 

CC treating such diseases 
XX 

SQ Sequence 1278 BP; 220 A; 426 C; 347 G; 285 T; 0 U; 0 Other; 

Query Match 100.0%; Score 1278; DB 4; Length 1278; 

Best Local Similarity 100.0%; Pred. No. 3.1e-289; 

Matches 1278; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

Qy   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

Qy  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

Qy  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

Qy  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840

Qy  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960

Qy  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260

Qy 1261 ACCACAGTGCTGCCCTGA 1278
        ||||||||||||||||||
Db 1261 ACCACAGTGCTGCCCTGA 1278

RESULT 2 
AAV63468 

ID AAV63468 standard; cDNA; 1564 BP. 
XX 

AC AAV63468; 
XX 

DT 26-JAN-1999 (first entry) 
XX 

DE cDNA encoding G-protein coupled receptor (HFGAN72X) polypeptide. 
XX 

KW G-protein coupled receptor; HFGAN72X; HIV infection; anorexia; cancer; 

KW bulimia; asthma; Parkinson's disease; acute heart failure; 

KW urinary retention; osteoporosis; angina pectoris; myocardial infarction; 

KW benign prostatic hypertrophy; neurological disorder; ss. 

XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers 

FT CDS 154..1431

FT /*tag= a

FT /product= "HFGAN72X"

XX 

PN EP875566-A2. 
XX 

PD 04-NOV-1998. 
XX 

PF 27-OCT-1997; 97EP-00308563.



XX 

PR 30-APR-1997; 97US-00846704.
XX 

PA (SMIK) SMITHKLINE BEECHAM CORP.
XX 

PI Bergsma DJ, Ellis CE; 
XX 

DR WPI; 1998-559432/48. 

DR P-PSDB; AAW80456. 
XX 

PT New human G-protein coupled receptor HFGAN72X polypeptide and 

PT polynucleotide - useful as diagnostic reagents and for treating e.g. HIV 

PT infection, cancer and Parkinson's disease. 

XX 

PS Claim 3; Page 7; 24pp; English. 
XX 

CC The present sequence encodes a G-protein coupled receptor (HFGAN72X) 

CC polypeptide. HFGAN72X polypeptides and polynucleotides are useful for 

CC diagnosing diseases related to over or under expression of HFGAN72X 

CC proteins by identifying mutations in the HFGAN72X gene using HFGAN72X 

CC probes, or determining HFGAN72X protein or mRNA expression levels. 

CC HFGAN72X polypeptides are also useful for screening for compounds which 

CC affect activity of the protein. Diseases that can be treated with 

CC HFGAN72X include HIV infections, pain, anorexia, cancers, bulimia, 

CC asthma, Parkinson's disease, acute heart failure, hypotension, 

CC hypertension, urinary retention, osteoporosis, angina pectoris, 

CC myocardial infarction, ulcers, allergies, benign prostatic hypertrophy, 

CC and psychotic and neurological disorders

XX 

SQ Sequence 1564 BP; 271 A; 511 C; 435 G; 347 T; 0 U; 0 Other; 



Query Match 99.7%; Score 1274.8; DB 2; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 1.9e-288; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 



Qy 


l 


ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 


60 






I | | I I I I I I I I I 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


154 


ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 


213 


Qy 


61 


TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 


120 






1 1 1 1 1 1 1 1 1 1 II I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 




Db 


214 


T CCCCTGT GCCT CCAGACTAT GAAGATGAGTTTCT CCGCTAT CTGTGGCGTGATTATCT G 


273 


Qy 


121 


TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 


180 




1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db->- 


274 


■TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 


333 


Qy 


181 


CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 


240 




1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


334 


CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 


393 


Qy 


241 


ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 


300 




I I I I 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


394 


ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 


453 


Qy 


301 


CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 


360 



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 



Db 


454 


CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 


513 




361 


GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 


420 




1 I I I I I I I I I I I I I I I I I I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 




Db 


514 


GTCAT CCCCT AT CTACAGGCT GT GT CCGT GT CAGT GGCAGTGCT AACT CT CAGCTT CATC 


573 


Qy 


491 

*± *L ± 


f;rrrTf;f;Arrf;rTf;f;TATf;rrATrTG;cCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 


480 




1 1 1 1 1 1 1 1 LI 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 




Db 


574 


GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 


633 


Qy 


401 


arc raTnafTrrATrrTn^GCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 


540 




1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


634 


GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 


693 


Qy 


t/i 1 

3fi X 


rrarTraT(^;r^7\ATf;rAr;rA(^T^;Tr;rTp;rrTGAGCTAGCCAACCGCACACGGCTCTTCTCA 


600 




| | II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


694 


GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 


753 


Qy 


bUl 


rTfTfTr ATr a a rr^nTr^nnr A f^ATnArPTrTATrcr A AGATCTACCACAGTTGCTTCTTT 


660 




| | || | | M | || | | | | | | 1 1 1 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 




Db 


754 


GTCT GT GAT GAAC GCT GGGCAGAT GACCT CT AT CCCAAGAT CT ACCACAGTT GCTT CTTT 


813 


Qy    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053


Qy    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173

Qy   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy   1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy   1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353



Qy   1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
          ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Db   1354 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413


Qy   1261 ACCACAGTGCTGCCCTGA 1278
          ||||||||||||||||||
Db   1414 ACCACAGTGCTGCCCTGA 1431





RESULT 3 
AAV68514 

ID AAV68514 standard; cDNA; 1564 BP. 
XX 

AC AAV68514; 
XX 

DT 29-JAN-1999 (first entry) 
XX 

DE Nucleotide sequence of a probe HGS EST 554692.
XX 

KW Probe HGS EST 554692; G-protein coupled receptor family; HFGAN72Y; 

KW mutation; probe; agonist; antagonist; activation; inhibition; 

KW gene therapy; antibody; immune response; vaccine; HIV-1; HIV-2; cancer; 

KW anorexia; bulimia; asthma; Parkinson's disease; acute heart failure; 

KW hypotension; hypertension; urinary retention; osteoporosis; 

KW angina pectoris; myocardial infarction; ulcer; allergies; 

KW psychotic disorder; neurological disorder; gene mapping; ss. 

XX 

OS Synthetic. 

OS Homo sapiens. 
XX 

PN EP875565-A2. 
XX 

PD 04-NOV-1998. 
XX 

PF 27-OCT-1997; 97EP-00308554 . 
XX 

PR 30-APR-1997; 97US-00846705 . 
XX 

PA (SMIK ) SMITHKLINE BEECHAM CORP. 
XX 

PI Bergsma DJ, Ellis C; 
XX 

DR WPI; 1998-570286/49. 
XX 

PT New G-protein coupled receptor HFGAN72Y polypeptide and polynucleotide - 

PT useful as diagnostic reagents and for prevention and treatment of HIV 

PT infections, cancer, osteoporosis and Parkinson's disease.
XX 

PS Example 1; Page 19-20; 22pp; English. 
XX 

CC This is the nucleotide sequence of the probe HGS EST 554692 used in the 

CC method of the invention involving the G-protein coupled receptor, 

CC HFGAN72Y. Its polypeptides and polynucleotides are useful for diagnosing 

CC susceptibility to diseases by detecting mutations in the HFGAN72Y gene 

CC using probes containing the HFGAN72Y nucleotide sequence, and can 

CC diagnose diseases associated with HFGAN72Y imbalance by determining 



CC HFGAN72Y polypeptide or mRNA expression levels. Agonists/antagonists can 

CC be used in treatment to activate/inhibit HFGAN72Y activity, in addition

CC to direct administration of antisense sequences to prevent expression, or 

CC HFGAN72Y polypeptides to treat conditions associated with a lack of HFGAN72Y

CC protein. Gene therapy may also be used to affect endogenous HFGAN72Y 

CC polypeptide production. HFGAN72Y antibodies are useful for inducing an 

CC immune response to immunise and prevent diseases, and for isolating 

CC HFGAN72Y clones or purifying the polypeptides by affinity chromatography. 

CC HFGAN72Y polypeptides can be administered directly or as a vaccine to 

CC inoculate against diseases. Diseases diagnosed, prevented or treated 

CC include HIV-1 or HIV-2 infections, pain, cancers, anorexia, bulimia, 

CC asthma, Parkinson's disease, acute heart failure, hypotension, 

CC hypertension, urinary retention, osteoporosis, angina pectoris, 

CC myocardial infarction, ulcers, allergies, benign prostatic hypertrophy,

CC and psychotic and neurological disorders. The HFGAN72Y polypeptide is 

CC also useful for mapping the gene to a chromosome, allowing gene 

CC inheritance to be studied through linkage analysis 

XX 

SQ Sequence 1564 BP; 269 A; 508 C; 436 G; 347 T; 0 U; 4 Other; 



Query Match 99.7%; Score 1274.8; DB 2; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 1.9e-288; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy    241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy    301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513

Qy    361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573

Qy    421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633

Qy    481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693

Qy    541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753

Qy    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173

Qy   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy   1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy   1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353

Qy   1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
          |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1354 TCCTTGTAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy   1261 ACCACAGTGCTGCCCTGA 1278
          ||||||||||||||||||
Db   1414 ACCACAGTGCTGCCCTGA 1431



RESULT 4 



AAS17464 

ID AAS17464 standard; DNA; 1564 BP.
XX 

AC AAS17464; 
XX 

DT 25-FEB-2002 (first entry) 
XX 

DE Human G protein-coupled receptor HFGAN72 variant CDS. 
XX 

KW Human; G protein-coupled receptor; GPCR; HFGAN72; ds; 

KW bacterial infection; fungal infection; protozoan infection; 

KW viral infection; human immunodeficiency virus; HIV; cancer; diabetes; 

KW Parkinson's disease; osteoporosis; myocardial infarction; ulcer; asthma; 

KW allergy; angina pectoris; renal disease; depression; schizophrenia; 

KW anorexia; obesity; Kallman's syndrome; hypothalamic disorder; 

KW idiopathic hormone deficiency; gigantism; migraine; pain; lung disease; 

KW burn; sleep disorder; jet lag; Huntington's disease; gene therapy. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 154..1431

FT /*tag= a 

FT /product= "HFGAN71X variant" 

XX 

PN US2001025031-A1. 
XX 

PD 27-SEP-2001. 
XX 

PF 06-APR-2001; 2001US-00828538 . 
XX 

PR 08-JUN-1998; 98US-0088524P.

PR 22-JUL-1998; 98US-0093726P . 

PR 08-JUN-1999; 99US-00328014 . 
XX 

PA (ELLI/) ELLIS C E. 

PA (KWOK/) KWOK C. 

PA (BODS/) BODSWORTH N J. 

PA (HALS/) HALSEY W. 

PA (HORN/) HORN S V. 

XX 

PI Ellis CE, Kwok C, Bodsworth NJ, Halsey W, Horn SV; 
XX 

DR WPI; 2001-624968/72. 

DR P-PSDB; AAU11188. 
XX 

PT Isolated HFGAN72 receptor useful for treatment of a patient having need 

PT of HFGAN72 receptor and in the detection and treatment of disease, e.g. 

PT infections such as bacterial, fungal, protozoan and viral infections and 

PT cancers. 
XX 

PS Disclosure; Fig 5; 75pp; English. 
XX 

CC The invention relates to an isolated polypeptide, the HFGAN72 receptor or 

CC its variant, encoded by the 8 exon sequences given in the specification. 

CC HFGAN72 is a G protein-coupled receptor (GPCR). HFGAN72 is useful for the

CC treatment of a patient having need of HFGAN72 receptor where HFGAN72 is 



CC administered by providing to the patient DNA encoding HFGAN72 and 

CC expressing HFGAN72 in vivo (i.e. by gene therapy). HFGAN72 is particularly

CC useful for applications in the detection and treatment of disease, e.g. 

CC infections such as bacterial, fungal, protozoan and viral infections, 

CC particularly infections caused by human immunodeficiency virus (HIV)-1 or

CC HIV-2, cancers, diabetes, Parkinson's disease, osteoporosis, myocardial 

CC infarction, ulcers, asthma, allergies, angina pectoris, renal disease, 

CC depression, schizophrenia, anorexia, obesity, Kallman's syndrome, 

CC hypothalamic disorders, idiopathic hormone deficiency (e.g. gigantism), 

CC migraine, pain, lung diseases, burns, sleep disorders, jet lag, 

CC Huntington's disease and many other diseases and disorders given in the 

CC specification. The present sequence is the coding sequence of an

CC alternative allele of the human HFGAN72 receptor 

XX 

SQ Sequence 1564 BP; 267 A; 514 C; 437 G; 346 T; 0 U; 0 Other; 



Query Match 99.7%; Score 1274.8; DB 4; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 1.9e-288; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
          |||||||||||||||||||||||||||||||||||||||||||||||||| |||||||||
Db    214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGCGATTATCTG 273

Qy    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy    241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy    301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513

Qy    361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573

Qy    421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633

Qy    481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693



Qy    541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

Db    694                                                              753



Qy    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173

Qy   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy   1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy   1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353



Qy   1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1354 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy   1261 ACCACAGTGCTGCCCTGA 1278
          ||||||||||||||||||
Db   1414 ACCACAGTGCTGCCCTGA 1431



RESULT 5 
AAF32103 

ID AAF32103 standard; cDNA; 1564 BP.
XX 

AC AAF32103; 



XX 

DT 10-APR-2001 (first entry) 
XX 

DE Human HFGAN72 receptor coding sequence SEQ ID NO: 12. 
XX 

KW Human; mouse; rat; Lig72A; Lig72B; neuropeptide receptor; HFGAN72; 

KW truncation mutant; ligand; neurodegenerative disorder; pain; 

KW eating disorder; behaviour disorder; mood disorder; ss. 
XX 

OS Homo sapiens.
XX 

PN WO200100787-A2. 
XX 

PD 04-JAN-2001. 
XX 

PF 22-JUN-2000; 2000WO-US017251.
XX 

PR 25-JUN-1999; 99US-0141156P . 
XX 

PA (SMIK ) SMITHKLINE BEECHAM CORP. 

PA (SMIK ) SMITHKLINE BEECHAM PLC. 
XX 

PI Bingham S, Darker J, Liu W, Martin JD, Parsons AA, Patel SR; 
XX 

DR WPI; 2001-071483/08. 
XX 

PT Polynucleotides encoding Lig 72A polypeptides or their variants, which 

PT are useful in the treatment of a disease or disorder associated with 

PT pain, e.g. enhanced or exaggerated sensitivity to pain, hyperalgesia, 

PT neuropathic pain and back pain. 
XX 

PS Disclosure; Fig 6; 101pp; English.
XX 

CC The present invention provides the protein and coding sequences for the

CC human, mouse and rat HFGAN receptor ligand Lig72A. It also provides 

CC truncated mutant versions. These, and their agonists and antagonists, are 

CC all useful in the treatment of eating, neurodegenerative, behaviour, 

CC mood, sexual, hormonal and sleep disorders, pain, depression, epilepsy 

CC and acute inflammatory conditions 

XX 

SQ Sequence 1564 BP; 271 A; 511 C; 435 G; 347 T; 0 U; 0 Other; 

Query Match 99.7%; Score 1274.8; DB 4; Length 1564; 
Best Local Similarity 99.8%; Pred. No. 1.9e-288; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273



Qy    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy    241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy    301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513

Qy    361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573

Qy    421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633

Qy    481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693

Qy    541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753

Qy    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173



Qy   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy   1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy   1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353

Qy   1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
          ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Db   1354 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy   1261 ACCACAGTGCTGCCCTGA 1278
          ||||||||||||||||||
Db   1414 ACCACAGTGCTGCCCTGA 1431



RESULT 6 




ABA96021
ID ABA96021 standard; cDNA; 1564 BP.
XX
AC ABA96021;
XX
DT 12-MAR-2002 (first entry)
XX
DE HGS EST 554692.
XX
KW G-protein; receptor; HFGAN72Y; cytostatic; cardiant; analgesic; cancer;
KW nootropic; tranquillising; neuroprotective; anti-asthmatic; gene therapy;
KW infection; HIV-1; pain; anorexia; bulimia; Parkinson's disease; ulcer;
KW cardiac disease; urinary retention; asthma; allergy; psychotic disorder;
KW benign prostatic hypertrophy; neurological disorder; anxiety; delirium;
KW schizophrenia; manic depression; dementia; mental retardation; EST;
KW dyskinesia; Huntington's disease; Tourette's syndrome; HIV-2;
KW HGS EST 554692; expressed sequence tag; probe; ss.
XX
OS Homo sapiens.
XX
PN EP1156110-A2.
XX
PD 21-NOV-2001.
XX
PF 27-OCT-1997; 2001EP-00203010.
XX
PR 30-APR-1997; 97US-00846705.
PR 27-OCT-1997; 97EP-00308554.
XX
PA (SMIK ) SMITHKLINE BEECHAM CORP.
XX
PI Bergsma DJ, Ellis CE;
XX
DR WPI; 2002-084320/12.
XX

PT New polynucleotide encoding a G-protein coupled receptor designated 

PT HFGAN72Y is useful to diagnose and treat associated diseases including 

PT cancer, infection, cardiac disease and psychotic and neurological

PT disorders. 
XX 

PS Example 1; Page 19-20; 22pp; English. 
XX 

CC The sequence represents HGS EST 554692. The sequence was used in the

CC invention as a probe to screen a human genomic placenta phage library. 

CC The invention relates to a novel isolated polynucleotide encoding 

CC HFGAN72Y polypeptide. The polypeptide of the invention has cytostatic, 

CC cardiant, analgesic, tranquillising, nootropic, neuroprotective, and anti 

CC -asthmatic activity. The HFGAN72Y has a use in gene therapy. The HFGAN72Y 

CC polynucleotide or an HFGAN72Y polypeptide agonist are used to treat a 

CC subject in need of enhanced HFGAN72Y activity or expression. An HFGAN72Y 

CC antagonist or competitor, or nucleic acid which inhibits HFGAN72Y 

CC expression is used to treat a subject in need of decreased HFGAN72Y 

CC activity or expression. HFGAN72Y-associated diseases include infections, 

CC particularly by HIV-1 or HIV-2, pain, anorexia, bulimia, Parkinson's 

CC disease, cardiac diseases, cancers, ulcers, urinary retention, asthma, 

CC allergies, benign prostatic hypertrophy, and psychotic and neurological 

CC disorders including anxiety, schizophrenia, manic depression, delirium, 

CC dementia, severe mental retardation and dyskinesias such as Huntington's 

CC disease and Tourette's syndrome 

XX 

SQ Sequence 1564 BP; 269 A; 508 C; 436 G; 347 T; 0 U; 4 Other; 



Query Match 99.7%; Score 1274.8; DB 6; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 1.9e-288; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333


Qy    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393


Qy   241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy   301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513

Qy   361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573


Qy   421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633

Qy   481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693

Qy   541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753

Qy   601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy   661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy   721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy   781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db   934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy   841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy   901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy   961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173

Qy  1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy  1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy  1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353


Qy  1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
         |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1354 TCCTTGTAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy  1261 ACCACAGTGCTGCCCTGA 1278
         ||||||||||||||||||
Db  1414 ACCACAGTGCTGCCCTGA 1431

RESULT 7 
AAI64173 

ID AAI64173 standard; cDNA; 1564 BP. 
XX 

AC AAI64173; 
XX 

DT 22-JAN-2002 (first entry) 
XX 

DE HFGAN72X G coupled receptor polypeptide partial sequence. 
XX 

KW Antibacterial; fungicide; virucide; protozoacide; anti-HIV; analgesic; 

KW cytostatic; nootropic; antiparkinsonian; cardiant; antiulcer; 

KW antiasthmatic; tranquiliser; neuroleptic; antidepressant; anticonvulsant;

KW osteopathic; HIV infection; pain; cancer; anorexia; bulimia; 

KW Parkinson's disease; acute heart failure; hypotension; hypertension;

KW urinary retention; osteoporosis; angina pectoris; probe; 

KW myocardial infarction; ulcers; asthma; allergy; delirium; dementia; 

KW benign prostatic hypertrophy; anxiety; schizophrenia; manic depression; 

KW dyskinesia; G coupled receptor; HFGAN72X; 7 transmembrane receptor; ss. 

XX 

OS Homo sapiens. 



XX 

FH Key Location/Qualifiers 

FT CDS 154..1362

FT /*tag= a 

FT /partial 

FT /product= "HFGAN72X protein" 

FT /note= "The specification states that this is a partial 

FT sequence even though it contains start and stop codons;

FT HFGAN72X is a G coupled receptor polypeptide" 

FT /transl_except= (pos:991..993, aa:Ala)

XX 



PN EP1154019-A2. 
XX 

PD 14-NOV-2001. 
XX 

PF 27-OCT-1997; 2001EP-00203008.
XX

PR 30-APR-1997; 97US-00846704.

PR 27-OCT-1997; 97EP-00308563.
XX 

PA (SMIK ) SMITHKLINE BEECHAM CORP. 
XX 

PI Bergsma DJ, Ellis CE; 
XX 

DR WPI; 2002-012659/02. 

DR P-PSDB; AAG78346. 
XX 

PT Nucleic acid encoding the HFGAN72X receptor, useful for diagnosis and 

PT treatment of e.g. infections, cancer, anorexia, bulimia, Parkinson's 

PT disease, and acute heart failure. 






PS Example 3; Page 9; 24pp; English. 
XX 

CC The present sequence is that of a known partial nucleotide sequence 

CC encoding a HFGAN72X polypeptide (AAG78346) used as a probe to identify 

CC the HFGAN72X gene (AAI64173). The specification describes a newly

CC isolated polynucleotide encoding a human HFGAN72X G coupled receptor 

CC polypeptide. The protein of the invention has antibacterial, fungicide, 

CC virucide, protozoacide, anti-HIV, cardiant, analgesic, cytostatic, 

CC nootropic, antiparkinsonian, antiulcer, antiasthmatic, tranquiliser, 

CC neuroleptic, antidepressant, anticonvulsant and osteopathic activities. 

CC HFGAN72X polynucleotides (PNs) are used to express HFGAN72X in vivo, to 

CC treat diseases requiring increased activity or expression of HFGAN72X; 

CC for recombinant production of HFGAN72X; diagnose diseases by detecting 

CC mutations in genomic sequences and in chromosome identification and 

CC mapping. HFGAN72X polypeptides are used to raise specific antibodies; as 

CC therapeutic agents; to identify HFGAN72X protein-expressing clones; to 

CC purify HFGAN72X proteins; in vaccines. Cells transformed with HFGAN72X 

CC PNs are used to identify (ant)agonists of HFGAN72X, useful

CC therapeutically. Nucleic acids that inhibit expression of HFGAN72X and 

CC polypeptides that compete with ligands for binding to HFGAN72X proteins 

CC are also useful therapeutically and diagnostically. HFGAN72X-related

CC diseases include infections (bacterial, viral, fungal or protozoal,

CC particularly HIV-1 or -2); pain; cancer; anorexia; bulimia; Parkinson's 

CC disease; acute heart failure; hypotension; hypertension; urinary 

CC retention; osteoporosis; angina pectoris; myocardial infarction; ulcers; 

CC asthma; allergy; benign prostatic hypertrophy; anxiety; schizophrenia; 

CC manic depression; delirium; dementia; severe mental retardation and 

CC dyskinesias 

XX 

SQ Sequence 1564 BP; 269 A; 508 C; 436 G; 347 T; 0 U; 4 Other; 



Query Match 99.7%; Score 1274.8; DB 6; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 1.9e-288; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0;
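
The summary figures above can be cross-checked with a few lines of Python. This is a minimal sketch under stated assumptions, not the search program's own code: it assumes that Best Local Similarity is the number of matching columns divided by the number of aligned columns, and that Query Match is the score divided by the perfect score of 1278 reported for the query. The helper names `count_columns` and `percent` are hypothetical and exist only for this illustration.

```python
def count_columns(query: str, subject: str) -> tuple[int, int]:
    """Return (matches, mismatches) for two gap-free aligned strings of equal length."""
    if len(query) != len(subject):
        raise ValueError("aligned strings must have equal length")
    matches = sum(q == s for q, s in zip(query, subject))
    return matches, len(query) - matches


def percent(part: float, whole: float) -> float:
    """Express part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Last nine columns of the Qy 781..840 row and of the Db 934..993 row.
matches, mismatches = count_columns("CCCCGGGCC", "CCCCGGGGC")

# Figures copied from the summary lines; 1278 is the query length (assumed
# here to equal the perfect score under the identity scoring table).
best_local_similarity = percent(1276, 1276 + 2)
query_match = percent(1274.8, 1278)
```

Under these assumptions the two percentages round to the 99.8% and 99.7% printed in the report, and the nine-column excerpt contains the single G/C difference at database position 992.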

Qy     1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy    61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy   121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy   181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy   241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy   301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513

Qy   361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573

Qy   421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633

Qy   481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693

Qy   541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753

Qy   601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy   661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy   721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy   781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db   934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy   841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy   901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy   961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173

Qy  1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy  1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy  1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353

Qy  1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
         |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1354 TCCTTGTAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy  1261 ACCACAGTGCTGCCCTGA 1278
         ||||||||||||||||||
Db  1414 ACCACAGTGCTGCCCTGA 1431



RESULT 8 
AAI64172 

ID AAI64172 standard; cDNA; 1564 BP. 
XX 

AC AAI64172; 
XX 

DT 22-JAN-2002 (first entry) 
XX 

DE Human HFGAN72X G coupled receptor polypeptide cDNA. 
XX 

KW Antibacterial; fungicide; virucide; protozoacide; anti-HIV; analgesic;

KW cytostatic; nootropic; antiparkinsonian; cardiant; antiulcer; 

KW antiasthmatic; tranquiliser; neuroleptic; antidepressant; anticonvulsant;

KW osteopathic; HIV infection; pain; cancer; anorexia; bulimia; 

KW Parkinson's disease; acute heart failure; hypotension; hypertension; 

KW urinary retention; osteoporosis; angina pectoris; myocardial infarction; 

KW ulcers; asthma; allergy; delirium; dementia; 

KW benign prostatic hypertrophy; anxiety; schizophrenia; manic depression; 

KW dyskinesia; G coupled receptor; HFGAN72X; 7 transmembrane receptor; ss. 
XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers 

FT CDS 154..1431

FT /*tag= a 

FT /product= "HFGAN72X protein" 

FT /note= "G coupled receptor polypeptide" 

XX 

PN EP1154019-A2. 
XX 

PD 14-NOV-2001. 
XX 

PF 27-OCT-1997; 2001EP-00203008.
XX

PR 30-APR-1997; 97US-00846704.

PR 27-OCT-1997; 97EP-00308563.

XX 

PA (SMIK ) SMITHKLINE BEECHAM CORP. 
XX 

PI Bergsma DJ, Ellis CE; 
XX 

DR WPI; 2002-012659/02. 

DR P-PSDB; AAG78345. 
XX 

PT Nucleic acid encoding the HFGAN72X receptor, useful for diagnosis and 

PT treatment of e.g. infections, cancer, anorexia, bulimia, Parkinson's 



PT disease, and acute heart failure. 

XX

PS Claim 3; Page 7; 24pp; English. 
XX 

CC The present sequence is that of a cDNA encoding a HFGAN72X polypeptide 

CC (AAG78345). The specification describes a newly isolated polynucleotide

CC encoding a HFGAN72X G coupled receptor polypeptide. The protein of the 

CC invention has antibacterial, fungicide, virucide, protozoacide, anti-HIV, 

CC cardiant, analgesic, cytostatic, nootropic, antiparkinsonian, antiulcer, 

CC antiasthmatic, tranquiliser, neuroleptic, antidepressant, anticonvulsant 

CC and osteopathic activities. HFGAN72X polynucleotides (PNs) are used to 

CC express HFGAN72X in vivo, to treat diseases requiring increased activity 

CC or expression of HFGAN72X; for recombinant production of HFGAN72X; 

CC diagnose diseases (or susceptibility to them) by detecting mutations in 

CC genomic sequences and in chromosome identification and mapping. HFGAN72X 

CC polypeptides are used to raise specific antibodies; as therapeutic agents;

CC to identify HFGAN72X protein-expressing clones; to purify HFGAN72X

CC proteins; in vaccines. Cells transformed with HFGAN72X PNs are used to

CC identify (ant)agonists of HFGAN72X, useful therapeutically. Nucleic acids

CC that inhibit expression of HFGAN72X and polypeptides that compete with

CC ligands for binding to HFGAN72X proteins are also useful therapeutically

CC and diagnostically. HFGAN72X-related diseases include infections

CC (bacterial, viral, fungal or protozoal, particularly HIV-1 or -2); pain;

CC cancer; anorexia; bulimia; Parkinson's disease; acute heart failure; 

CC hypotension; hypertension; urinary retention; osteoporosis; angina 

CC pectoris; myocardial infarction; ulcers; asthma; allergy; benign 

CC prostatic hypertrophy; anxiety; schizophrenia; manic depression; delirium;

CC dementia; severe mental retardation and dyskinesias
XX 

SQ Sequence 1564 BP; 271 A; 511 C; 435 G; 347 T; 0 U; 0 Other; 



Query Match 99.7%; Score 1274.8; DB 6; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 1.9e-288; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 



Qy     1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy    61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273


Qy   121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333


Qy   181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy   241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy   301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513

Qy   361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573

Qy   421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633

Qy   481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693

Qy   541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753

Qy   601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy   661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy   721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy   781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db   934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy   841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy   901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy   961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173

Qy  1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy  1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy  1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353

Qy  1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
         ||||||||||||||||||||| ||||||||||||||||||||||||||||||||||||||
Db  1354 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy  1261 ACCACAGTGCTGCCCTGA 1278
         ||||||||||||||||||
Db  1414 ACCACAGTGCTGCCCTGA 1431



RESULT 9 
ABZ42789 

ID ABZ42789 standard; DNA; 1564 BP. 
XX 

AC ABZ42789; 
XX 

DT 04-MAR-2003 (first entry) 
XX 

DE Human orexin receptor 1 nucleotide SEQ ID NO: 367. 
XX 

KW G protein-coupled receptor; GPCR; antigenic peptide; gene therapy; 

KW G protein-coupled receptor modulator; antibody; immune-related disease;

KW growth-related disease; cell regeneration-related disease; AIDS; cancer;

KW immunological-related cell proliferative disease; autoimmune disease;

KW Alzheimer's disease; atherosclerosis; infection; osteoarthritis; allergy;

KW osteoporosis; cardiomyopathy; inflammation; Crohn's disease; diabetes;

KW graft versus host disease; Parkinson's disease; multiple sclerosis; pain; 

KW psoriasis; anxiety; depression; schizophrenia; dementia; memory loss; 

KW mental retardation; epilepsy; asthma; tuberculosis; obesity; nausea; 

KW hypertension; hypotension; renal disorder; rheumatoid arthritis; trauma; 

KW ulcer; gene; ds.

XX

OS Homo sapiens.
XX 

PN WO200261087-A2. 
XX 

PD 08-AUG-2002. 
XX 

PF 19-DEC-2001; 2001WO-US050107.
XX 

PR 19-DEC-2000; 2000US-0257144P. 
XX 

PA (LIFE-) LIFESPAN BIOSCIENCES INC. 
XX 

PI Burmer GC, Roush CL, Brown JP; 

XX

DR WPI; 2003-046718/04. 

DR P-PSDB; ABP81941.

XX 

PT New isolated antigenic peptides e.g., for G protein-coupled receptors 

PT (GPCR), useful for diagnosing and designing drugs for treating conditions

PT in which GPCRs are involved, e.g. AIDS, Alzheimer's disease, cancer or 

PT autoimmune diseases. 

XX 

PS Disclosure; Fig 1; 523pp; English. 
XX 

CC The present invention describes antigenic peptides (I) comprising: (a) 



CC any one of 1601 sequences (see ABP82019 to ABP83619) of 12-24 amino 

CC acids. Also described: (1) an assay for the detection of a particular G 

CC protein-coupled receptor (GPCR) or a candidate polypeptide in a sample; 

CC and (2) an isolated antibody having high specificity and high affinity or 

CC avidity for a particular GPCR. (I) can be used as GPCR modulators and in 

CC gene therapy. The antigenic peptides for GPCRs are useful in detecting an 

CC antibody against a particular GPCR, and in the production of specific 

CC antibodies. The peptides and antibodies are also useful for detecting the 

CC presence or absence of corresponding GPCRs. The antigenic peptides for 

CC GPCRs and antibodies are useful for diagnosing and designing drugs for 

CC treating immune-related diseases, growth-related diseases, cell 

CC regeneration-related disease, immunological-related cell proliferative 

CC diseases, or autoimmune diseases, e.g. AIDS, Alzheimer's disease, 

CC atherosclerosis, bacterial, fungal, protozoan or viral infections, 

CC osteoarthritis, osteoporosis, cancer, cardiomyopathy, chronic and acute 

CC inflammation, allergies, Crohn's disease, diabetes, graft versus host 

CC disease, Parkinson's disease, multiple sclerosis, pain, psoriasis, 

CC anxiety, depression, schizophrenia, dementia, mental retardation, memory 

CC loss, epilepsy, asthma, tuberculosis, obesity, nausea, hypertension, 

CC hypotension, renal disorders, rheumatoid arthritis, trauma, ulcers, or 

CC any other disorder in which GPCRs are involved. The antibodies may be 

CC used in immunoassays and immunodiagnosis. ABZ42523 to ABZ42869 encode

CC GPCR proteins given in ABP81675 to ABP82018, which are used in the 

CC exemplification of the present invention 

XX 

SQ Sequence 1564 BP; 268 A; 513 C; 436 G; 347 T; 0 U; 0 Other; 



Query Match 99.7%; Score 1274.8; DB 7; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 1.9e-288; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy     1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy    61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy   121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy   181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy   241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy   301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513



Qy 



361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 
I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



420 



514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573 

421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | I I I I I I I I I I I 

574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633 

481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I 

634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693 

541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753 

601 GT CT GT GAT GAAC GCT GGGCAGAT GAC CT CTAT CC CAAGAT CT ACCACAGTTGCTT CT TT 660 

I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 

754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813 

661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 72 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873 

721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933 

781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 84 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993 

841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 
I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113 

961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I 

1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173 

1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233 

1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293 

1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353 

1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I | I I I I I I I I I I I I I I I I I I II I I I 

1354 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413 



Qy 1261 ACCACAGTGCTGCCCTGA 1278 

I I I I I I I I I I I I I I I I I I 
Db 1414 ACCACAGTGCTGCCCTGA 1431 



RESULT 10 
ABI98014 

ID ABI98014 standard; cDNA; 1278 BP. 
XX 

AC ABI98014; 
XX 

DT 18-FEB-2002 (first entry) 
XX 

DE Non-endogenous human GPCR cDNA, SEQ ID NO: 548. 

XX

KW Human; G protein-coupled receptor; GPCR; non-endogenous; mutant; 

KW constitutively activated GPCR; agonist; disease; ss. 

XX 

OS Homo sapiens.

OS Synthetic. 
XX 

PN WO200177172-A2. 
XX 

PD 18-OCT-2001. 
XX 

PF 05-APR-2001; 2001WO-US011098.
XX 

PR 07-APR-2000; 2000US-0195747P.
XX 

PA (AREN-) ARENA PHARM INC. 
XX 

PI Lehmann-Bruinsma K, Liaw CW, Lin I; 

XX 

DR WPI; 2001-648759/74. 

DR P-PSDB; ABB56378. 
XX 

PT Identifying agonists of G protein-coupled receptors (GPCRs) for use in 

PT disease treatment, comprises contacting candidate compounds with versions 

PT of GPCRs. 
XX 

PS Example 2; Page 349-350; 394pp; English. 
XX 

CC The invention relates to G protein-coupled receptors (GPCRs) for which 

CC the endogenous ligand has been identified. Non-endogenous constitutively 

CC activated versions of known GPCRs are used in the invention for the 

CC direct identification of candidate compounds as receptor agonists, 

CC inverse agonists or partial agonists. Such agonists are useful as 

CC therapeutic agents for diseases or disorders associated with GPCRs. The 

CC present sequence encodes a non-endogenous version of a known human GPCR 

XX 

SQ Sequence 1278 BP; 224 A; 423 C; 346 G; 285 T; 0 U; 0 Other; 

Query Match 99.4%; Score 1270; DB 5; Length 1278; 
Best Local Similarity 99.6%; Pred. No. 2.3e-287; 

Matches 1273; Conservative 0; Mismatches 5; Indels 0; Gaps 0; 



1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

II I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I II I I I I I I I I I I I II I I I I I I I I I 

61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I 1 1 1 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I U I I I I I I I I I I I I I I I I I I 

241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 


421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
541 GCAGTCATGGAATGCAGCAGTGTGCTGCGTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I if I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840

841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 



Db 841 [sequence not legible in source] 900



Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I II I I I I I 1 I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I

Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I ! I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGGGCTGCCAACCCCATCATCTACAACTTC 1080

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1201 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260 

Qy 1261 ACCACAGTGCTGCCCTGA 1278 

I I I II I I I I I I I I I I I I I 
Db 1261 ACCACAGTGCTGCCCTGA 1278 
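
The summary figures reported for each result appear to be related by simple arithmetic. The sketch below is an illustration only and is not part of the search output. It assumes a perfect score of 1278 (the query length under identity scoring) and a mismatch penalty of 0.6; that penalty is not stated anywhere in the report and is inferred from the reported scores. Gap penalties are not modelled, so it applies only to results with Indels 0. The function name is hypothetical.

```python
def alignment_stats(matches, mismatches, indels, perfect_score, mismatch_penalty=0.6):
    """Recompute the per-result summary figures from the reported counts.

    mismatch_penalty=0.6 is an assumption inferred from the reported scores,
    not a documented parameter of the search program. Gap penalties are
    ignored, so indels only count towards the number of aligned columns.
    """
    aligned_columns = matches + mismatches + indels
    score = matches - mismatch_penalty * mismatches
    return {
        "score": round(score, 1),
        "query_match_pct": round(100.0 * score / perfect_score, 1),
        "best_local_similarity_pct": round(100.0 * matches / aligned_columns, 1),
    }


# Counts reported for RESULT 10 (ABI98014) and RESULT 11 (AAD09335).
result_10 = alignment_stats(matches=1273, mismatches=5, indels=0, perfect_score=1278)
result_11 = alignment_stats(matches=1245, mismatches=33, indels=0, perfect_score=1278)
print(result_10)
print(result_11)
```

Under these assumptions the recomputed values agree with the Score, Query Match and Best Local Similarity lines printed for RESULT 10 and RESULT 11.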



RESULT 11 
AAD09335 

ID AAD09335 standard; cDNA; 1278 BP. 
XX 

AC AAD09335; 
XX 

DT 10-SEP-2001 (first entry) 
XX 

DE Cynomolgous Monkey Orexin 1 Receptor cDNA. 
XX 

KW Cynomolgous monkey; Orexin 1 Receptor; 7 Transmembrane Receptor family; 

KW 7TM; gene therapy; vaccine; microbial infection; HIV-1; HIV-2; pain; 

KW cancer; diabetes; obesity; anorexia; bulimia; urinary retention; 

KW Parkinson's disease; acute heart failure; hypotension; hypertension; 

KW osteoporosis; angina pectoris; myocardial infarction; stroke; ulcer; 

KW asthma; allergy; benign prostatic hypertrophy; migraine; vomiting; 

KW psychotic disorder; neurological disorder; anxiety; schizophrenia; 

KW manic depression; depression; delirium; dementia; mental retardation; 

KW dyskinesia; Huntington's disease; Gilles de la Tourette's syndrome; ss.

XX 

OS Macaca fascicularis.
XX 

FH Key Location/Qualifiers 

FT CDS 1..1278

FT /*tag= a

FT /product= "Orexin 1 Receptor"
XX 

PN WO200140259-A2. 
XX 

PD 07-JUN-2001. 
XX 

PF 04-DEC-2000; 2000WO-US032849 . 
XX 

PR 02-DEC-1999; 99US-0168553P . 

PR 28-NOV-2000; 2000US-00723781 . 
XX 

PA (SMIK ) SMITHKLINE BEECHAM CORP. 

PA (SMIK ) SMITHKLINE BEECHAM PLC. 
XX 

PI Ellis CE; 
XX 

DR WPI; 2001-408276/43. 

DR P-PSDB; AAE04740. 
XX 

PT Novel Cynomolgous Monkey Orexin 1 Receptor polypeptides, for treating 

PT infections, pain, cancer, diabetes, obesity, asthma, schizophrenia, 

PT hypertension, urinary retention, Parkinson's disease and stroke. 
XX 

PS Claim 1; Page 28; 33pp; English. 
XX 

CC The present sequence is a cDNA encoding Cynomolgous Monkey Orexin 1 

CC Receptor which is structurally related to members of 7 Transmembrane 

CC Receptor (7TM) family. The Orexin 1 Receptor polypeptide and 

CC polynucleotide are useful for treating bacterial, fungal, protozoan and 

CC viral infections, particularly infections caused by HIV-1 or HIV-2, pain, 

CC cancer, diabetes, obesity, anorexia, bulimia, Parkinson's disease, acute 

CC heart failure, hypotension, hypertension, urinary retention, 

CC osteoporosis, angina pectoris, myocardial infarction, stroke, ulcers, 

CC asthma, allergies, benign prostatic hypertrophy, migraine, vomiting, 

CC psychotic and neurological disorders including anxiety, schizophrenia, 

CC manic depression, depression, delirium, dementia and severe mental 

CC retardation, and dyskinesias, such as Huntington's disease or Gilles de 

CC la Tourette's syndrome. The polypeptide is also useful for structure- 

CC based design of its agonist, antagonist or inhibitor. The polynucleotide 

CC is useful for chromosome localisation studies and in gene therapy. The 

CC Orexin 1 Receptor polypeptide and polynucleotide are also useful as 

CC vaccines 

XX 

SQ Sequence 1278 BP; 219 A; 433 C; 346 G; 280 T; 0 U; 0 Other; 



Query Match 95.9%; Score 1225.2; DB 4; Length 1278;

Best Local Similarity 97.4%; Pred. No. 7.1e-277; 

Matches 1245; Conservative 0; Mismatches 33; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I 

Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGCGGGTCCCCACTGGCAGCAGGGAGCCA 60 



Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

      |||||||||||||||||||||||||| |||||||||||||| |||||||| |||||||||

Db 61 TCCCCTGTGCCTCCAGACTATGAAGACGAGTTTCTCCGCTACCTGTGGCGCGATTATCTG 120



121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

I I I I I I I I I I I I I I II I I 1 I I I I I I I I II I I I I I I I I M I I I I I I I I I I I I I I I I I i I 
121 TACCCAAAACAGTACGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCCTCGTGGCC 180 

181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 

181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

241 ACCAACTACTTCATCGTCAACCTGTCCCTGGCTGACGTTCTGGTAACTGCCATCTGCCTG 300 

301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

301 CCGGTCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCTCTCTGCAAG 360 

361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I 

361 GTCATCCCCTATCTACAGGCCGTGTCCGTGTCAGTGGCAGTGCTGACTCTCAGCTTCATC 420 

421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I M I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

I I I I I I I I I' I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCTGTCATGGTGCCCCAGGCT 540 

541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I 
541 GCAGTCATGGAATGCAGCAGTGTGCTGCCCGAGCTAGCCAACCGCACACGGCTCTTCTCG 600 

601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I.I I I I I I I I I I I I I I I I I I II I I I I I
601 GTCTGTGATGAACGCTGGGCAGATGACCTATATCCCAAGATCTACCACAGTTGCTTCTTC 660

661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
721 AAGCTCTGGGGCCGCCAGATTCCCGGCACCACCTCAGCACTGGTGCGAAACTGGAAGCGC 780 

781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I II I ITI I II I I II I I I I I I I I I I I I I I 

781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGACAGCCCCAGCCCCGGGCC 840

841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCGCGGAGGAAGACAGCCAAGATGCTG 900

901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

901 ATGGTGGTGCTGCTGGTCTTTGCCCTCTGCTACCTGCCCATCAGTGTCCTCAATGTCCTT 960 

961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 



I I I I I I I I I I I II II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020



1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 

1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGTGCTGCCAACCCCATCATCTACAACTTC 1080 

1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCCG 1140 

1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
1141 GGCCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200

1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 

1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAACTCTCTGAGCACGTGGTGCTCACCAGCGTC 1260 

1261 ACCACAGTGCTGCCCTGA 1278 

I I I I I I I I I I I I I I I I I I 

1261 ACCACAGTGCTGCCCTGA 1278 



RESULT 12
AAT42826

ID AAT42826 standard; cDNA; 1209 BP.
XX
AC AAT42826;
XX
DT 22-FEB-1997 (first entry)
XX
DE Neuropeptide receptor gene.
XX
KW Human; neuropeptide receptor; drug screening; receptor-agonist;
KW receptor-antagonist; anorectic; antitumour; anticholesterolemic;
KW neuroprotective; anticonvulsant; hypotensive; sedative; diagnostic;
KW gene therapy; ss.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT primer_bind complement(1..18)
FT /*tag= a
FT /note= "Binds primer AAT42829"
FT misc_difference 151..153
FT /*tag= b
FT /codon= seq:CCA, aa:Ala
FT primer_bind complement(1190..1192)
FT /*tag= c
FT /note= "Binds primers AAT42830 and AAT42832"
XX
PN WO9634877-A1.
XX
PD 07-NOV-1996.
XX
PF 05-MAY-1995; 95WO-US005616.
XX 

PR 05-MAY-1995; 95WO-US005616 . 
XX 

PA (HUMA-) HUMAN GENOME SCI INC. 
XX 

PI Soppet DR, Li Y, Rosen CA; 
XX 

DR WPI; 1996-506094/50. 

DR P-PSDB; AAW06124. 
XX 

PT Human neuro-peptide receptor polypeptide(s) - used to identify

PT antagonists and agonists to such polypeptide(s), e.g. in the treatment of

PT obesity, Alzheimer's disease, epilepsy, etc. 

XX 

PS Claim 6; Page 48-49; 77pp; English. 
XX 

CC The sequence encodes a human neuropeptide receptor, and has been mapped 

CC to human chromosome 1q31-34. The sequence has been isolated from a human

CC adult hypothalamus cDNA library, and is structurally related to the G-

CC protein-coupled receptor family. Splice variants are given in AAT42827- 

CC 28. The sequence may be amplified by PCR with e.g. primers AAT42829-34 

CC for expression in a host cell. The recombinant receptor may be used in a 

CC drug screening assay for isolation of receptor-agonists and -antagonists, 

CC which may be used as anorectic, antitumour, anticholesterolemic, 

CC neuroprotective, anticonvulsant, hypotensive or sedative drugs, etc. The 

CC DNA may also be used in genetic disease diagnosis or gene therapy. The

CC receptor and its corresponding antibody may also be used in therapy and 

CC diagnosis 
XX 

SQ Sequence 1209 BP; 206 A; 402 C; 330 G; 271 T; 0 U; 0 Other; 

Query Match 94.1%; Score 1202.6; DB 2; Length 1209; 

Best Local Similarity 99.7%; Pred. No. 1.4e-271; 

Matches 1205; Conservative 0; Mismatches 4; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I 

Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCCCAGCCTATGTGGCTGTGTTCGTCGTGGGC 180

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I | | | | | | I I I I I I I I I I I I I 1.1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 



Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

| I | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360



Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I M I 

Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 601 GTCTGTCATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II 

Db 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 

Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 



Qy 1201 TCCTTGCAG 1209 

I I I I I I II 
Db 1201 TCCTTGTAG 1209 
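
Each record's SQ line gives the sequence length together with a per-base composition, so the two can be cross-checked. The following is a minimal sketch of such a check, not part of the search output; the function name and the regular expressions are assumptions chosen for illustration, and it expects the SQ line layout shown in these records.

```python
import re


def parse_sq_line(line):
    """Split an 'SQ Sequence ...' line into (declared_length, base_counts)."""
    length_match = re.search(r"Sequence\s+(\d+)\s+BP", line)
    if length_match is None:
        raise ValueError("not an SQ line: %r" % line)
    declared_length = int(length_match.group(1))
    # Each composition field looks like '206 A;' or '0 Other;'.
    base_counts = {
        base: int(count)
        for count, base in re.findall(r"(\d+)\s+(A|C|G|T|U|Other);", line)
    }
    return declared_length, base_counts


# SQ line of RESULT 12 (AAT42826) as printed in the record.
sq_line = "SQ Sequence 1209 BP; 206 A; 402 C; 330 G; 271 T; 0 U; 0 Other;"
declared_length, base_counts = parse_sq_line(sq_line)
composition_total = sum(base_counts.values())
print(declared_length, composition_total)
```

For this record the composition counts add up to the declared length of 1209 BP, which is a quick way to spot an SQ line damaged in extraction.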



RESULT 13 
AAV68512 

ID AAV68512 standard; cDNA; 1133 BP. 
XX 

AC AAV68512; 
XX 

DT 29-JAN-1999 (first entry) 
XX 

DE Nucleotide sequence of HGS EST 557082. 
XX

KW HGS EST 557082; G-protein coupled receptor family; HFGAN72Y; mutation; 

KW probe; agonist; antagonist; activation; inhibition; gene therapy; 

KW antibody; immune response; vaccine; HIV-1; HIV-2; cancer; anorexia; 

KW bulimia; asthma; Parkinson's disease; acute heart failure; hypotension; 

KW hypertension; urinary retention; osteoporosis; angina pectoris; 

KW myocardial infarction; ulcer; allergies; psychotic disorder; 

KW neurological disorder; gene mapping; ss. 

XX 

OS Homo sapiens . 
XX 

PN EP875565-A2. 
XX 

PD 04-NOV-1998. 
XX 

PF 27-OCT-1997; 97EP-00308554 . 
XX 

PR 30-APR-1997; 97US-00846705 . 
XX 

PA (SMIK ) SMITHKLINE BEECHAM CORP. 
XX 

PI Bergsma DJ, Ellis C; 
XX 

DR WPI; 1998-570286/49. 
XX 

PT New G-protein coupled receptor HFGAN72Y polypeptide and polynucleotide - 

PT useful as diagnostic reagents and for prevention and treatment of HIV 

PT infections, cancer, osteoporosis and Parkinson's disease. 
XX 

PS Example 1; Page 18-19; 22pp; English. 

XX

CC This is the nucleotide sequence of the HGS EST 557082 used in the method 

CC of the invention involving the G-protein coupled receptor, HFGAN72Y. Its 

CC polypeptides and polynucleotides are useful for diagnosing susceptibility 

CC to diseases by detecting mutations in the HFGAN72Y gene using probes 

CC containing the HFGAN72Y nucleotide sequence, and can diagnose diseases 

CC associated with HFGAN72Y imbalance by determining HFGAN72Y polypeptide or 

CC mRNA expression levels. Agonists/antagonists can be used in treatment to 

CC activate/inhibit HFGAN72Y activity, in addition to direct administration 

CC of antisense sequences to prevent expression, or HFGAN72Y polypeptides to 

CC treat conditions associated with a lack of HFGAN72Y protein. Gene therapy

CC may also be used to affect endogenous HFGAN72Y polypeptide production. 



CC HFGAN72Y antibodies are useful for inducing an immune response to 

CC immunise and prevent diseases, and for isolating HFGAN72Y clones or

CC purifying the polypeptides by affinity chromatography. HFGAN72Y 

CC polypeptides can be administered directly or as a vaccine to inoculate 

CC against diseases. Diseases diagnosed, prevented or treated include HIV-1 

CC or HIV-2 infections, pain, cancers, anorexia, bulimia, asthma, 

CC Parkinson's disease, acute heart failure, hypotension, hypertension, 

CC urinary retention, osteoporosis, angina pectoris, myocardial infarction, 

CC ulcers, allergies, benign prostatic hypertrophy, and psychotic and

CC neurological disorders. The HFGAN72Y polypeptide is also useful for 

CC mapping the gene to a chromosome, allowing gene inheritance to be studied 

CC through linkage analysis 

XX 

SQ Sequence 1133 BP; 202 A; 366 C; 314 G; 251 T; 0 U; 0 Other; 



Query Match 85.0%; Score 1086.4; DB 2; Length 1133; 

Best Local Similarity 99.9%; Pred. No. 2.1e-244;

Matches 1087; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 



Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600



Qy    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840

Qy    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960

Qy    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

Qy   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy   1081 CTCAGTGG 1088
          ||||||||
Db   1081 CTCAGTGG 1088



RESULT 14 
ABA96020 

ID ABA96020 standard; cDNA; 1133 BP. 
XX

AC ABA96020; 
XX 

DT 12-MAR-2002 (first entry)
XX 

DE HGS EST 557082. 
XX 

KW G-protein; receptor; HFGAN72Y; cytostatic; cardiant; analgesic; cancer; 

KW nootropic; tranquillising; neuroprotective; anti-asthmatic; gene therapy; 

KW infection; HIV-1; pain; anorexia; bulimia; Parkinson's disease; ulcer; 

KW cardiac disease; urinary retention; asthma; allergy; psychotic disorder; 

KW benign prostatic hypertrophy; neurological disorder; anxiety; delirium; 

KW schizophrenia; manic depression; dementia; mental retardation; EST; 

KW dyskinesia; Huntington's disease; Tourette's syndrome; HIV-2;

KW HGS EST 557082; expressed sequence tag; ss. 



XX 

OS Homo sapiens. 
XX 

PN EP1156110-A2. 
XX 

PD 21-NOV-2001. 
XX 

PF 27-OCT-1997; 2001EP-00203010.
XX

PR 30-APR-1997; 97US-00846705.

PR 27-OCT-1997; 97EP-00308554.
XX 

PA (SMIK ) SMITHKLINE BEECHAM CORP. 
XX 

PI Bergsma DJ, Ellis CE; 

XX 

DR WPI; 2002-084320/12. 
XX 

PT New polynucleotide encoding a G-protein coupled receptor designated 

PT HFGAN72Y is useful to diagnose and treat associated diseases including 

PT cancer, infection, cardiac disease and psychotic and neurological 

PT disorders. 
XX 

PS Example 1; Page 18-19; 22pp; English. 
XX 

CC The sequence represents HGS EST 557082. The invention relates to a novel 

CC isolated polynucleotide encoding HFGAN72Y polypeptide. The polypeptide of 

CC the invention has cytostatic, cardiant, analgesic, tranquillising, 

CC nootropic, neuroprotective, and anti-asthmatic activity. The HFGAN72Y has 

CC a use in gene therapy. The HFGAN72Y polynucleotide or an HFGAN72Y 

CC polypeptide agonist are used to treat a subject in need of enhanced 

CC HFGAN72Y activity or expression. An HFGAN72Y antagonist or competitor, or 

CC nucleic acid which inhibits HFGAN72Y expression is used to treat a 

CC subject in need of decreased HFGAN72Y activity or expression. HFGAN72Y- 

CC associated diseases include infections, particularly by HIV-1 or HIV-2, 

CC pain, anorexia, bulimia, Parkinson's disease, cardiac diseases, cancers, 

CC ulcers, urinary retention, asthma, allergies, benign prostatic 

CC hypertrophy, and psychotic and neurological disorders including anxiety, 

CC schizophrenia, manic depression, delirium, dementia, severe mental 

CC retardation and dyskinesias such as Huntington's disease and Tourette's

CC syndrome 

XX 

SQ Sequence 1133 BP; 202 A; 366 C; 314 G; 251 T; 0 U; 0 Other; 

Query Match 85.0%; Score 1086.4; DB 6; Length 1133; 
Best Local Similarity 99.9%; Pred. No. 2.1e-244;

Matches 1087; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

Qy     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy    241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy    301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

Qy    361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

Qy    421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy    481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy    541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

Qy    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840

Qy    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960

Qy    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

Qy   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy   1081 CTCAGTGG 1088
          ||||||||
Db   1081 CTCAGTGG 1088



RESULT 15
AAV68511

ID AAV68511 standard; cDNA; 1170 BP.
XX
AC AAV68511;
XX
DT 29-JAN-1999 (first entry)
XX
DE Nucleotide sequence of HFGAN72Y a G-protein coupled receptor.
XX
KW G-protein coupled receptor family; HFGAN72Y; mutation; probe; agonist;
KW antagonist; activation; inhibition; gene therapy; antibody;
KW immune response; vaccine; HIV-1; HIV-2; cancer; anorexia; bulimia;
KW asthma; Parkinson's disease; acute heart failure; hypotension;
KW hypertension; urinary retention; osteoporosis; angina pectoris;
KW myocardial infarction; ulcer; allergies; psychotic disorder;
KW neurological disorder; gene mapping; ss.
XX
OS Homo sapiens.
XX
FH Key Location/Qualifiers
FT CDS 1..1170
FT /*tag= a
FT /product= "HFGAN72Y protein"
XX
PN EP875565-A2.
XX
PD 04-NOV-1998.
XX
PF 27-OCT-1997; 97EP-00308554.
XX
PR 30-APR-1997; 97US-00846705.
XX
PA (SMIK ) SMITHKLINE BEECHAM CORP.
XX
PI Bergsma DJ, Ellis C;
XX
DR WPI; 1998-570286/49.
DR P-PSDB; AAW80805.
XX
PT New G-protein coupled receptor HFGAN72Y polypeptide and polynucleotide -
PT useful as diagnostic reagents and for prevention and treatment of HIV
PT infections, cancer, osteoporosis and Parkinson's disease.
XX
PS Claim 3; Page 7; 22pp; English.
XX







CC This is the nucleotide sequence of the G-protein coupled receptor, 

CC HFGAN72Y used in the method of the invention. HFGAN72Y polypeptides and

CC polynucleotides are useful for diagnosing susceptibility to diseases by 

CC detecting mutations in the HFGAN72Y gene using probes containing the 

CC HFGAN72Y nucleotide sequence, and can diagnose diseases associated with 

CC HFGAN72Y imbalance by determining HFGAN72Y polypeptide or mRNA expression 

CC levels. Agonists/antagonists can be used in treatment to activate/inhibit 

CC HFGAN72Y activity, in addition to direct administration of antisense 

CC sequences to prevent expression, or HFGAN72Y polypeptides to treat 

CC conditions associated with a lack HFGAN72Y protein. Gene therapy may also

CC be used to affect endogenous HFGAN72Y polypeptide production. HFGAN72Y 

CC antibodies are useful for inducing an immune response to immunise and 

CC prevent diseases, and for isolating HFGAN72Y clones or purifying the 

CC polypeptides by affinity chromatography. HFGAN72Y polypeptides can be 

CC administered directly or as a vaccine to inoculate against diseases. 

CC Diseases diagnosed, prevented or treated include HIV-1 or HIV-2 

CC infections, pain, cancers, anorexia, bulimia, asthma, Parkinson's 

CC disease, acute heart failure, hypotension, hypertension, urinary 

CC retention, osteoporosis, angina pectoris, myocardial infarction, ulcers,

CC allergies, benign prostatic hypertrophy, and psychotic and neurological 

CC disorders. The HFGAN72Y polypeptide is also useful for mapping the gene 

CC to a chromosome, allowing gene inheritance to be studied through linkage 

CC analysis 

XX 

SQ Sequence 1170 BP; 208 A; 381 C; 322 G; 259 T; 0 U; 0 Other; 

Query Match 85.0%; Score 1086.4; DB 2; Length 1170; 

Best Local Similarity 99.9%; Pred. No. 2.1e-244; 

Matches 1087; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

Qy     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy    241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy    301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360



Qy    361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420



Qy    421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy    481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy    541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

Qy    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840

Qy    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960

Qy    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

Qy   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy   1081 CTCAGTGG 1088
          ||||||||
Db   1081 CTCAGTGG 1088



Search completed: October 15, 2004, 16:01:42 
Job time : 550.899 secs



GenCore version 5.1.6
Title: 
682709 seqs, 277475446 residues
Minimum DB seq length: 0

Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
score greater than or equal to the score of the result being printed,
ALIGNMENTS



RESULT 1 
US-08-846-705-4 

; Sequence 4, Application US/08846705
; Patent No. 5935814
; GENERAL INFORMATION:
; APPLICANT: BERGSMA, DERK J.
; APPLICANT: ELLIS, CATHERINE E
; TITLE OF INVENTION: NOVEL G-PROTEIN COUPLED
; NUMBER OF SEQUENCES: 5
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: RATNER & PRESTIA
; STREET: P.O. BOX 980
; CITY: VALLEY FORGE
; STATE: PA
; COUNTRY: USA
; ZIP: 19482
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ for Windows Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/846,705
; FILING DATE: 30-APR-1997
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; ATTORNEY/AGENT INFORMATION:
; NAME: PRESTIA, PAUL F
; REGISTRATION NUMBER: 23,031
; REFERENCE/DOCKET NUMBER: GH-70003
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 610-407-0700
; TELEFAX: 610-407-0701
; TELEX: 846169
; INFORMATION FOR SEQ ID NO: 4:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1564 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
US-08-846-705-4 

Query Match 99.7%; Score 1274.8; DB 2; Length 1564;

Best Local Similarity 99.8%; Pred. No. 3.2e-287; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy    241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy    301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513



Qy    361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573

Qy    421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633

Qy    481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693

Qy    541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753

Qy    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993


Qy    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy    901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113

Qy    961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173

Qy   1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233

Qy   1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293

Qy   1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353

Qy   1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260
          |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1354 TCCTTGTAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413

Qy   1261 ACCACAGTGCTGCCCTGA 1278
          ||||||||||||||||||
Db   1414 ACCACAGTGCTGCCCTGA 1431



RESULT 2 
US-08-846-704-1 

; Sequence 1, Application US/08846704
; Patent No. 6020157
; GENERAL INFORMATION:
; APPLICANT: BERGSMA, DERK J.
; APPLICANT: ELLIS, CATHERINE E.
; TITLE OF INVENTION: NOVEL G-PROTEIN COUPLED
; NUMBER OF SEQUENCES: 4
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: RATNER & PRESTIA
; STREET: P.O. BOX 980
; CITY: VALLEY FORGE
; STATE: PA
; COUNTRY: USA
; ZIP: 19482
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ for Windows Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/846,704
; FILING DATE: 30-APR-1997
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; ATTORNEY/AGENT INFORMATION:
; NAME: PRESTIA, PAUL F
; REGISTRATION NUMBER: 23,031
; REFERENCE/DOCKET NUMBER: GH-70002
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 610-407-0700
; TELEFAX: 610-407-0701
; TELEX: 846169
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1564 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
US-08-846-704-1



Query Match 99.7%; Score 1274.8; DB 3; Length 1564;

Best Local Similarity 99.8%; Pred. No. 3.2e-287;

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0;



Qy      1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213

Qy     61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy    121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy    181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393

Qy    241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453

Qy    301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513

Qy    361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573

Qy    421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633

Qy    481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693

Qy    541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753

Qy    601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813

Qy    661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873

Qy    721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy    781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
          |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db    934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993

Qy    841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
          ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053



Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353 

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1354 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413 



Qy 1261 ACCACAGTGCTGCCCTGA 1278

I I I I I I I I I I I I I I I I I I

Db 1414 ACCACAGTGCTGCCCTGA 1431



RESULT 3 
US-08-846-704-3 

; Sequence 3, Application US/08846704
; Patent No. 6020157
; GENERAL INFORMATION:
; APPLICANT: BERGSMA, DERK J.
; APPLICANT: ELLIS, CATHERINE E.
; TITLE OF INVENTION: NOVEL G-PROTEIN COUPLED
; NUMBER OF SEQUENCES: 4
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: RATNER & PRESTIA
; STREET: P.O. BOX 980
; CITY: VALLEY FORGE
; STATE: PA
; COUNTRY: USA
; ZIP: 19482
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ for Windows Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/846,704
; FILING DATE: 30-APR-1997
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; ATTORNEY/AGENT INFORMATION:
; NAME: PRESTIA, PAUL F
; REGISTRATION NUMBER: 23,031
; REFERENCE/DOCKET NUMBER: GH-70002
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 610-407-0700
; TELEFAX: 610-407-0701
; TELEX: 846169
; INFORMATION FOR SEQ ID NO: 3:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1564 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
US-08-846-704-3 

Query Match 99.7%; Score 1274.8; DB 3; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 3.2e-287; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 
Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 
Db 334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I II I I I II I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693 



Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753 



Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813 

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 

Db 874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993 



Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173



Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233 



Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I I I I I I I II I II II II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353 

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1354 TCCTTGTAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413 



Qy 1261 ACCACAGTGCTGCCCTGA 1278

I I I I I I I I I I I I I I I I I I 

Db 1414 ACCACAGTGCTGCCCTGA 1431 



RESULT 4 

US-08-462-509B-1 

; Sequence 1, Application US/08462509B
; Patent No. 6410701
; GENERAL INFORMATION:
; APPLICANT: Soppet, Daniel et al
; TITLE OF INVENTION: Human Neuropeptide Receptor
; NUMBER OF SEQUENCES: 12
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Human Genome Sciences, Inc.
; STREET: 9410 Key West Avenue
; CITY: Rockville
; STATE: MD
; COUNTRY: USA
; ZIP: 20850
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/462,509B
; FILING DATE: 05-JUN-1995
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: WO PCT/US95/05616
; FILING DATE: 05-MAY-1995
; ATTORNEY/AGENT INFORMATION:
; NAME: Wales, Michele M.
; REGISTRATION NUMBER: 43,975
; REFERENCE/DOCKET NUMBER: PF168P1
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 301-309-8504
; TELEFAX: 301-309-8439
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1209 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: DNA (genomic)
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 1..1209
US-08-462-509B-1 

Query Match 94.4%; Score 1205.8; DB 4; Length 1209; 

Best Local Similarity 99.8%; Pred. No. 3.2e-271; 
Matches 1207; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I   I

Db 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200

Qy 1201 TCCTTGCAG 1209

I I I I I I   I I

Db 1201 TCCTTGTAG 1209



RESULT 5 

PCT-US95-05616-1 

; Sequence 1, Application PC/TUS9505616
; GENERAL INFORMATION:
; APPLICANT: LI, ET AL.
; TITLE OF INVENTION: Human Neuropeptide Receptor
; NUMBER OF SEQUENCES: 12
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: CARELLA, BYRNE, BAIN, GILFILLAN,
; ADDRESSEE: CECCHI, STEWART & OLSTEIN
; STREET: 6 BECKER FARM ROAD
; CITY: ROSELAND
; STATE: NEW JERSEY
; COUNTRY: USA
; ZIP: 07068
; COMPUTER READABLE FORM:
; MEDIUM TYPE: 3.5 INCH DISKETTE
; COMPUTER: IBM PS/2
; OPERATING SYSTEM: MS-DOS
; SOFTWARE: WORD PERFECT 5.1
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: PCT/US95/05616
; FILING DATE: concurrently
; CLASSIFICATION:
; ATTORNEY/AGENT INFORMATION:
; NAME: FERRARO, GREGORY D.
; REGISTRATION NUMBER: 36,134
; REFERENCE/DOCKET NUMBER: 325800-268
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 201-994-1700
; TELEFAX: 201-994-1744
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1209 BASE PAIRS
; TYPE: NUCLEIC ACID
; STRANDEDNESS: SINGLE
; TOPOLOGY: LINEAR
; MOLECULE TYPE: cDNA
PCT-US95-05616-1 



Query Match 94.0%; Score 1201; DB 5; Length 1209; 

Best Local Similarity 99.6%; Pred. No. 4.2e-270; 

Matches 1204; Conservative 0; Mismatches 5; Indels 0; Gaps 0;



Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCCCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I 

Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 
Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I 
Db 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 GTCTGTCATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 721 AACCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780



Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840

Db 781 840



Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 

Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 

Db 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

Qy 1201 TCCTTGCAG 1209 

I I I I I I II 
Db 1201 TCCTTGTAG 1209 



RESULT 6 
US-08-846-705-3 

; Sequence 3, Application US/08846705
; Patent No. 5935814
; GENERAL INFORMATION:
; APPLICANT: BERGSMA, DERK J.
; APPLICANT: ELLIS, CATHERINE E
; TITLE OF INVENTION: NOVEL G-PROTEIN COUPLED
; NUMBER OF SEQUENCES: 5
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: RATNER & PRESTIA
; STREET: P.O. BOX 980
; CITY: VALLEY FORGE
; STATE: PA
; COUNTRY: USA
; ZIP: 19482
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ for Windows Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/846,705
; FILING DATE: 30-APR-1997
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; ATTORNEY/AGENT INFORMATION:
; NAME: PRESTIA, PAUL F
; REGISTRATION NUMBER: 23,031
; REFERENCE/DOCKET NUMBER: GH-70003
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 610-407-0700
; TELEFAX: 610-407-0701
; TELEX: 846169
; INFORMATION FOR SEQ ID NO: 3:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1133 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
US-08-846-705-3 



Query Match 85.0%; Score 1086.4; DB 2; Length 1133;

Best Local Similarity 99.9%; Pred. No. 1.8e-243;

Matches 1087; Conservative 0; Mismatches 1; Indels 0; Gaps 0;

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60



Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 

Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 



Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 



Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480



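Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540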



Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy 1081 CTCAGTGG 1088 

I I I I I I I I 
Db 1081 CTCAGTGG 1088 



RESULT 7
US-08-846-705-1
; Sequence 1, Application US/08846705
; Patent No. 5935814
; GENERAL INFORMATION:
; APPLICANT: BERGSMA, DERK J.
; APPLICANT: ELLIS, CATHERINE E
; TITLE OF INVENTION: NOVEL G-PROTEIN COUPLED
; NUMBER OF SEQUENCES: 5
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: RATNER & PRESTIA
; STREET: P.O. BOX 980
; CITY: VALLEY FORGE
; STATE: PA
; COUNTRY: USA
; ZIP: 19482
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Diskette
; COMPUTER: IBM Compatible
; OPERATING SYSTEM: DOS
; SOFTWARE: FastSEQ for Windows Version 2.0
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/846,705
; FILING DATE: 30-APR-1997
; CLASSIFICATION: 435
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER:
; FILING DATE:
; ATTORNEY/AGENT INFORMATION:
; NAME: PRESTIA, PAUL F
; REGISTRATION NUMBER: 23,031
; REFERENCE/DOCKET NUMBER: GH-70003
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 610-407-0700
; TELEFAX: 610-407-0701
; TELEX: 846169
; INFORMATION FOR SEQ ID NO: 1:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1170 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
US-08-846-705-1 

Query Match 85.0%; Score 1086.4; DB 2; Length 1170; 

Best Local Similarity 99.9%; Pred. No. 1.8e-243; 

Matches 1087; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 

Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240



Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300



301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 

301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

421 GC C CT G GAC C GCT GGT ATGC CAT CT GC CAC C CACT ATT GT T CAAGAGCACAGC C CGGC GG 480 

4 81 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 

Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I II I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

Qy 1081 CTCAGTGG 1088 

I I I II I I I 

Db 1081 CTCAGTGG 1088 



RESULT 8 

US-08-462-509B-3 

; Sequence 3, Application US/08462509B 

; Patent No. 6410701 

; GENERAL INFORMATION: 

APPLICANT: Soppet, Daniel et al 

TITLE OF INVENTION: Human Neuropeptide Receptor 
NUMBER OF SEQUENCES: 12 
; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: Human Genome Sciences, Inc. 

; STREET: 9410 Key West Avenue 

CITY: Rockville 

STATE: MD 

COUNTRY: USA 

ZIP: 20850 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 

; OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/462,509B 

FILING DATE: 05-JUN-1995 

CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: WO PCT/US95/05616 

FILING DATE: 05-MAY-1995 
; ATTORNEY/AGENT INFORMATION: 
; NAME: Wales, Michele M. 

; REGISTRATION NUMBER: 43,975 

REFERENCE/DOCKET NUMBER: PF168P1 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 301-309-8504 

TELEFAX: 301-309-8439 
; INFORMATION FOR SEQ ID NO: 3: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 1110 base pairs 

TYPE: nucleic acid 
; STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: DNA (genomic) 
FEATURE: 

NAME/KEY: CDS 

LOCATION: 1..1110 
US-08-462-509B-3 

Query Match 85.0%; Score 1085.8; DB 4; Length 1110; 

Best Local Similarity 99.8%; Pred. No. 2.4e-243; 

Matches 1087; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 



Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I 

Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I 

Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

I I I I I I I I I I I I I I I I I I I.I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTCCCCATCAGCGTCCTCAATGTCCTT 960 



Qy 961 

Db 961 

Qy 1021 

Db 1021 

Qy 1081 CTCAGTGGC 1089 

I I I I I I I I I 

Db 1081 CTCAGTGGC 1089 



RESULT 9 

US-08-462-509B-5 

; Sequence 5, Application US/08462509B 

; Patent No. 6410701 

; GENERAL INFORMATION: 

APPLICANT: Soppet, Daniel et al 

TITLE OF INVENTION: Human Neuropeptide Receptor 
; NUMBER OF SEQUENCES: 12 

CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Human Genome Sciences, Inc. 

; STREET: 9410 Key West Avenue 

; CITY: Rockville 

STATE: MD 
COUNTRY: USA 
ZIP: 20850 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/462,509B 
FILING DATE: 05-JUN-1995 
CLASSIFICATION: 435 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: WO PCT/US95/05616 
; FILING DATE: 05-MAY-1995 

; ATTORNEY/AGENT INFORMATION: 

; NAME: Wales, Michele M. 
; REGISTRATION NUMBER: 43,975 

REFERENCE/DOCKET NUMBER: PF168P1 
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: 301-309-8504 

; TELEFAX: 301-309-8439 

; INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1116 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULE TYPE: DNA (genomic) 
FEATURE: 

NAME/KEY: CDS 
LOCATION: 1..1116 



US-08-462-509B-5 



Query Match 84.8%; Score 1083.2; DB 4; Length 1116; 

Best Local Similarity 99.7%; Pred. No. 9.7e-243; 

Matches 1085; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 



Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGACCCC 60 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I . 
Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I 
Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II II I I I I I I I 
Db 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 



Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840 



Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

* I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

Qy 1081 CTCAGTGG 1088 

I I I I I I I I 
Db 1081 CTCAGTGG 1088 



RESULT 10 
PCT-US95-05616-5 

; Sequence 5, Application PC/TUS9505616 
; GENERAL INFORMATION: 

APPLICANT: LI, ET AL. 

TITLE OF INVENTION: Human Neuropeptide Receptor 
NUMBER OF SEQUENCES: 12 
; CORRESPONDENCE ADDRESS: 

ADDRESSEE: CARELLA, BYRNE, BAIN, GILFILLAN, 
; ADDRESSEE: CECCHI, STEWART & OLSTEIN 

; STREET: 6 BECKER FARM ROAD 

; CITY: ROSELAND 

; STATE: NEW JERSEY 

; COUNTRY: USA 

; ZIP: 07068 

COMPUTER READABLE FORM: 

MEDIUM TYPE: 3.5 INCH DISKETTE 
COMPUTER: IBM PS/2 
OPERATING SYSTEM: MS-DOS 
; SOFTWARE: WORD PERFECT 5.1 

CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: PCT/US95/05616 

FILING DATE: concurrently 
CLASSIFICATION: 
ATTORNEY/AGENT INFORMATION: 
NAME: FERRARO, GREGORY D. 
REGISTRATION NUMBER: 36,134 
REFERENCE/DOCKET NUMBER: 325800-268 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 201-994-1700 
TELEFAX: 201-994-1744 



INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1133 BASE PAIRS 
TYPE: NUCLEIC ACID 
STRANDEDNESS: SINGLE 
TOPOLOGY: LINEAR 
MOLECULE TYPE: cDNA 
PCT-US95-05616-5 

Query Match 84.8%; Score 1083.2; DB 5; Length 1133; 

Best Local Similarity 99.7%; Pred. No. 9.8e-243; 

Matches 1085; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGACCCC 60 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 

Db 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II 
Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 



Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 



Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I II I I I I I I I I II I I I I I I I I I I I I I! I I I I I I I I I I 1 I I I I I I I I I I I I II I I I I I I I 

Db 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

Qy 1081 CTCAGTGG 1088 

II I I I I I I 

Db 1081 CTCAGTGG 1088 



RESULT 11 
PCT-US95-05616-3 

; Sequence 3, Application PC/TUS9505616 

; GENERAL INFORMATION: 

; APPLICANT: LI, ET AL. 

; TITLE OF INVENTION: Human Neuropeptide Receptor 
NUMBER OF SEQUENCES: 12 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: CARELLA, BYRNE, BAIN, GILFILLAN, 
ADDRESSEE: CECCHI, STEWART & OLSTEIN 
STREET: 6 BECKER FARM ROAD 
CITY: ROSELAND 
; STATE: NEW JERSEY 
COUNTRY: USA 
ZIP: 07068 
COMPUTER READABLE FORM: 

MEDIUM TYPE: 3.5 INCH DISKETTE 
COMPUTER: IBM PS/2 
OPERATING SYSTEM: MS-DOS 
SOFTWARE: WORD PERFECT 5.1 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: PCT/US95/05616 
; FILING DATE: concurrently 

; CLASSIFICATION: 



; ATTORNEY/AGENT INFORMATION: 

NAME: FERRARO, GREGORY D. 
; REGISTRATION NUMBER: 36,134 

; REFERENCE/DOCKET NUMBER: 325800-268 

; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 201-994-1700 
; TELEFAX: 201-994-1744 

; INFORMATION FOR SEQ ID NO: 3: 
; SEQUENCE CHARACTERISTICS: 

; LENGTH: 1110 BASE PAIRS 

TYPE: NUCLEIC ACID 

STRANDEDNESS: SINGLE 

TOPOLOGY: LINEAR 
MOLECULE TYPE: cDNA 
PCT-US95-05616-3 

Query Match 84.3%; Score 1077.8; DB 5; Length 1110; 

Best Local Similarity 99.4%; Pred. No. 1.7e-241; 

Matches 1082; Conservative 0; Mismatches 7; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 

Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I 

Db 421 CCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 



Db 541 600 



Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 CTCTGTCATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 

Db 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II II II I I I I I I I I I I I II I I I I I I I 
Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTCCCCATCAGCGTCCTCAATGTCCTT 960 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 

Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

Qy 1081 CTCAGTGGC 1089 

I I I I I I I I I 
Db 1081 CTCAGTGGC 1089 



RESULT 12 

US-08-513-974B-375 

Sequence 375, Application US/08513974B 
Patent No. 6114139 
GENERAL INFORMATION: 

APPLICANT: Hinuma, Shuji 
APPLICANT: Hosoya, Masaki 
APPLICANT: Fujii, Ryo 
APPLICANT: Ohtaki, Tetsuya 
APPLICANT: Fukusumi, Shoji 
APPLICANT: Ohgi, Kazuhiro 

TITLE OF INVENTION: G PROTEIN COUPLED RECEPTOR PROTEIN, 
TITLE OF INVENTION: PRODUCTION, AND USE THEREOF 
NUMBER OF SEQUENCES: 380 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: DIKE, BRONSTEIN, ROBERTS & CUSHMAN, LLP 
STREET: 130 Water Street 



CITY: Boston 
STATE: MA 
COUNTRY: USA 
ZIP: 02109 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/513,974B 

FILING DATE: 14-SEP-1995 

CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: PCT/JP95/01599 

FILING DATE: 10-AUG-1995 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 7-093989 

FILING DATE: 19-AUG-1995 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 7-057186 

FILING DATE: 16-MAR-1995 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 7-007177 

FILING DATE: 20-JAN-1995 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 6-326611 

FILING DATE: 28-DEC-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 6-270017 

FILING DATE: 02-NOV-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 6-236357 

FILING DATE: 30-SEP-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 6-236356 

FILING DATE: 30-SEP-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 6-189274 

FILING DATE: 11-AUG-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 6-189273 

FILING DATE: 11-AUG-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 6-189272 

FILING DATE: 11-AUG-1994 
ATTORNEY/AGENT INFORMATION: 

NAME: Resnick, David S. 

REGISTRATION NUMBER: 34,235 

REFERENCE/ DOCKET NUMBER: 45753 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-523-3400 

TELEFAX: 617-523-6440 
INFORMATION FOR SEQ ID NO: 375: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 843 base pairs 

TYPE: nucleic acid 



STRANDEDNESS: double 
TOPOLOGY: linear 

MOLECULE TYPE: cDNA 

FEATURE: 

NAME/KEY: CDS 
LOCATION: 28..816 
US-08-513-974B-375 



Query Match 54.7%; Score 699.2; DB 3; Length 843; 

Best Local Similarity 90.0%; Pred. No. 1.6e-153; 

Matches 749; Conservative 0; Mismatches 83; Indels 0; Gaps 0; 

Qy 252 CATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCT 311 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 9 CGTGTTCATCCTGTCACTGGCCGATGTGCTGGTGACAGCCATCTGCCTGCCGGCCAGTCT 68 

Qy 312 GCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTA 371 

I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 69 GCTGGTAGACATCACGGAATCCTGGCTCTTTGGCCATGCCCTCTGCAAGGTCATCCCCTA 128 

Qy 372 TCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCG 431 

MINIMI I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 129 TCTACAGGCCGTGTCCGTGTCAGTGGTCGTGCTGACTCTCAGCTCCATCGCCCTGGACCG 188 

Qy 432 CTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTC 491 

MINI I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II II I II I I 

Db 189 CTGGTACGCCATCTGCCACCCGCTGTTGTTCAAGAGCACTGCCCGGCGCGCCCGCGGCTC 248 

Qy 492 CATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGA 551 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I II I I I I MINIM 
Db 249 CATCCTCGGCATCTGGGCGGTGTCGCTGGCTGTCATGGTGCCTCAGGCTGCTGTCATGGA 308 



Qy 552 ATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGA 611 

II I I I I I I I I I I I I I I II I I I I I I I I I I I II I I III I II I I I I I I I I I I I 
Db 309 GTGTAGCAGCGTGCTGCCCGAGCTGGCCAACCGCACCCGCCTCCTGTCTGTCTGTGATGA 368 

Qy 612 ACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTA 671 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 
Db 369 GCGCTGGGCAGACGACCTGTACCCCAAGATCTACCACAGCTGCTTCTTCATTGTCACCTA 428 

Qy 672 CCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGG 731 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 429 CCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATCTTCCGCAAGCTCTGGGG 488 

Qy 732 CCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCA 791 

I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I r • I I I I I I I I I I I I I I I I I I I I I I 

Db 489 CCGCCAGATCCCCGGCACCACCTCGGCCCTGGTGCGCAACTGGAAGCGGCCCTCAGACCA 548 

Qy 792 GCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCT 851 

I I I I I MM II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 549 GCTGGACGACCAGGGCCAGGGCCTGAGCTCAGAGCCCCAGCCCCGGGCCCGCGCCTTCCT 608 

Qy 852 GGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCT 911 

III II I I I II I I I I II I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 609 GGCCGAGGTGAAACAGATGCGAGCCCGGAGGAAGACGGCCAAGATGCTGATGGTGGTGCT 668 



Qy 912 GCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTT 971 



1 1 II 1 1 1 1 1 1 1 1 II 1 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 

Db 669 GCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGTGTCCTCAACGTCCTCAAGAGGGTCTT 728 

Qy 972 CGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCA 1031 

I I I I I I I I II I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 729 CGGGATGTTCCGCCAAGCCAGCGACCGAGAGGCCATCTACGCCTGCTTCACCTTCTCCCA 788 

Qy 1032 CTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTCCTC 1083 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I III II I I I I I I I I I I I I 

Db 789 CTGGCTGGTGTACGCCAACAGCGCCGCCAATCCCCTCCTCTACTCCTTCCTC 840 



RESULT 13 
US-08-513-974B-55 

Sequence 55, Application US/08513974B 
Patent No. 6114139 
GENERAL INFORMATION: 

APPLICANT: Hinuma, Shuji 
APPLICANT: Hosoya, Masaki 
APPLICANT: Fujii, Ryo 
APPLICANT: Ohtaki, Tetsuya 
APPLICANT: Fukusumi, Shoji 
APPLICANT: Ohgi, Kazuhiro 

TITLE OF INVENTION: G PROTEIN COUPLED RECEPTOR PROTEIN, 
TITLE OF INVENTION: PRODUCTION, AND USE THEREOF 
NUMBER OF SEQUENCES: 380 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: DIKE, BRONSTEIN, ROBERTS & CUSHMAN, LLP 
STREET: 130 Water Street 
CITY: Boston 
STATE: MA 
COUNTRY: USA 
ZIP: 02109 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/513,974B 
FILING DATE: 14-SEP-1995 
CLASSIFICATION: 536 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: PCT/JP95/01599 
FILING DATE: 10-AUG-1995 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 7-093989 
FILING DATE: 19-AUG-1995 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 7-057186 
FILING DATE: 16-MAR-1995 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 7-007177 
FILING DATE: 20-JAN-1995 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 6-326611 
FILING DATE: 28-DEC-1994 



; PRIOR APPLICATION DATA: 

; APPLICATION NUMBER: JP 6-270017 

FILING DATE: 02-NOV-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 6-236357 
FILING DATE: 30-SEP-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 6-236356 
FILING DATE: 30-SEP-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 6-189274 
FILING DATE: 11-AUG-1994 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: JP 6-189273 

FILING DATE: 11-AUG-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: JP 6-189272 
FILING DATE: 11-AUG-1994 
ATTORNEY/AGENT INFORMATION: 
; NAME: Resnick, David S. 

REGISTRATION NUMBER: 34,235 
REFERENCE/DOCKET NUMBER: 45753 
; TELECOMMUNICATION INFORMATION: 

; TELEPHONE: 617-523-3400 

; TELEFAX: 617-523-6440 

; INFORMATION FOR SEQ ID NO: 55: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 789 base pairs 

; TYPE: nucleic acid 

STRANDEDNESS: double 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
US-08-513-974B-55 

Query Match 52.6%; Score 672.2; DB 3; Length 789; 

Best Local Similarity 90.7%; Pred. No. 2.9e-147; 

Matches 716; Conservative 0; Mismatches 73; Indels 0; Gaps 0; 



Qy 271 GCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCTGCTGGTGGACATCACTGAG 330 

II II II I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II 
Db 1 GCCGATGTGCTGGTGACAGCCATCTGCCTGCCGGCCAGTCTGCTGGTAGACATCACGGAA 60 

Qy 331 TCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGGCTGTGTCCGTG 390 

I II I II I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 TCCTGGCTCTTTGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGGCCGTGTCCGTG 120 

Qy 391 TCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGTATGCCATCTGCCAC 450 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 TCAGTGGTCGTGCTGACTCTCAGCTCCATCGCCCTGGACCGCTGGTACGCCATCTGCCAC 180 

Qy 451 CCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCCTGGGCATCTGGGCT 510 

II II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 

Db 181 CCGCTGTTGTTCAAGAGCACTGCCCGGCGCGCCCGCGGCTCCATCCTCGGCATCTGGGCG 240 

Qy 511 GTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCAGTGTGCTGCCT 570 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 241 GTGTCGCTGGCTGTCATGGTGCCTCAGGCTGCTGTCATGGAGTGTAGCAGCGTGCTGCCC 300 



Qy 571 GAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGGCAGATGACCTC 630 

Mill I I I I I I I I I I I II III I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 GAGCTGGCCAACCGCACCCGCCTCCTGTCTGTCTGTGATGAGCGCTGGGCAGACGACCTG 360 

Qy 631 TATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCCCACTGGGCCTC 690 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 

Db 361 TACCCCAAGATCTACCACAGCTGCTTCTTCATTGTCACCTACCTGGCCCCACTGGGCCTC 420 

Qy 691 ATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACC 750 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 421 ATGGCCATGGCCTATTTCCAGATCTTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACC 480 

Qy 751 ACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGGACCTGGAGCAG 810 

I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I II III 

Db 481 ACCTCGGCCCTGGTGCGCAACTGGAAGCGGCCCTCAGACCAGCTGGACGACCAGGGCCAG 540 

Qy 811 GGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGATG 870 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 

Db 541 GGCCTGAGCTCAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCCGAGGTGAAACAGATG 600 

Qy 871 CGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGC 930 

II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 CGAGCCCGGAGGAAGACGGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGC 660 

Qy 931 TACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGTTCCGCCAAGCC 990 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 TACCTGCCCATCAGTGTCCTCAACGTCCTCAAGAGGGTCTTCGGGATGTTCCGCCAAGCC 720 

Qy 991 AGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAAC 1050 

II Mill II II I I I I I I I I M I M II I I I I I I I I I I II I M I I I I M I II I M I I 

Db 721 AGCGACCGAGAGGCCATCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAAC 780 

Qy 1051 AGCGCTGCC 1059 

Mill Ml 
Db 781 AGCGCCGCC 789 
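
The summary figures printed for this result can be cross-checked against one another. The sketch below is a reconstruction under stated assumptions, not GenCore's documented formula: it assumes a match scores 1 under IDENTITY_NUC, that Query Match is the score divided by the perfect score (1278), that Best Local Similarity is matches divided by aligned columns, and that each gap costs Gapop plus Gapext per gapped position. The mismatch penalty of 0.6 is an inference made here only because it reproduces the reported scores; all function names are hypothetical.

```python
# Cross-check of the per-result summary lines.
# All formulas are assumptions inferred from the printed numbers.

PERFECT_SCORE = 1278.0    # "Perfect score" from the search header
MISMATCH_PENALTY = 0.6    # assumption: inferred, not stated in the report
GAP_OPEN = 10.0           # "Gapop" from the search header
GAP_EXTEND = 1.0          # "Gapext" from the search header


def alignment_score(matches, mismatches, indels=0, gaps=0):
    """Score under the assumed scheme: +1 per match, -0.6 per mismatch,
    -Gapop per gap and -Gapext per gapped position."""
    return (matches
            - MISMATCH_PENALTY * mismatches
            - GAP_OPEN * gaps
            - GAP_EXTEND * indels)


def query_match_pct(score):
    """Score as a percentage of the perfect (self-match) score."""
    return 100.0 * score / PERFECT_SCORE


def local_similarity_pct(matches, mismatches, indels=0):
    """Identical columns as a percentage of all aligned columns."""
    return 100.0 * matches / (matches + mismatches + indels)


if __name__ == "__main__":
    # Counts printed for RESULT 13: Matches 716; Mismatches 73; Indels 0; Gaps 0
    score = alignment_score(716, 73)
    print(round(score, 1),
          round(query_match_pct(score), 1),
          round(local_similarity_pct(716, 73), 1))
```

With the RESULT 13 counts this reproduces the printed Score 672.2, Query Match 52.6% and Best Local Similarity 90.7%, which is the only evidence offered for the assumed penalties.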



RESULT 14 
US-09-461-436B-55 

; Sequence 55, Application US/09461436B 
; Patent No. 6538107 

GENERAL INFORMATION: 

APPLICANT: Shuji Hinuma 
; Yasuaki Ito 

; Ryo Fujii 

; TITLE OF INVENTION: G Protein Coupled Receptor Protein, 

; Production, And Use Thereof 

NUMBER OF SEQUENCES: 61 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Edwards & Angell, LLP 
STREET: 101 Federal Street 
; CITY: BOSTON 

; STATE: MA 

COUNTRY: USA 
ZIP: 02209 
COMPUTER READABLE FORM: 



MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO) 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/461,436B 

FILING DATE: 14-Dec-1999 

CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/513,974 

FILING DATE: 14-SEP-1995 

APPLICATION NUMBER: PCT/JP95/01599 

FILING DATE: 10-AUG-1995 

APPLICATION NUMBER: 7-093989 

FILING DATE: 19-APR-1995 

APPLICATION NUMBER: 7-057186 

FILING DATE: 16-MAR-1995 

APPLICATION NUMBER: 7-007177 

FILING DATE: 20-JAN-1995 

APPLICATION NUMBER: 6-326611 

FILING DATE: 28-DEC-1994 

APPLICATION NUMBER: 6-270017 
; FILING DATE: 02-NOV-1994 

APPLICATION NUMBER: 6-236357 
; FILING DATE: 30-SEP-1994 

APPLICATION NUMBER: 6-236356 

FILING DATE: 30-SEP-1994 
; APPLICATION NUMBER: 6-189274 

; FILING DATE: 11-AUG-1994 

APPLICATION NUMBER: 6-189273 

FILING DATE: 11-AUG-1994 

APPLICATION NUMBER: 6-189272 

FILING DATE: 11-AUG-1994 
ATTORNEY/AGENT INFORMATION: 

NAME: CONLIN, DAVID G. 

REGISTRATION NUMBER: <Unknown> 

REFERENCE/DOCKET NUMBER: 45753 DIV2 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 617-439-4444 

TELEFAX: 617-439-4170 
INFORMATION FOR SEQ ID NO: 55: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 789 base pairs 
; TYPE: nucleic acid 

; STRANDEDNESS: double 

TOPOLOGY: linear 
; MOLECULE TYPE: cDNA 

SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
US-09-461-436B-55 

Query Match 52.6%; Score 672.2; DB 4; Length 789; 

Best Local Similarity 90.7%; Pred. No. 2.9e-147; 

Matches 716; Conservative 0; Mismatches 73; Indels 0; Gaps 0; 



Qy 271 GCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCTGCTGGTGGACATCACTGAG 330 

II II II I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 1 GCCGATGTGCTGGTGACAGCCATCTGCCTGCCGGCCAGTCTGCTGGTAGACATCACGGAA 60 



Qy 331 TCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGGCTGTGTCCGTG 390 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 TCCTGGCTCTTTGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGGCCGTGTCCGTG 120 



Qy 391 TCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGTATGCCATCTGCCAC 450 

I I I I I I I I 1 I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 TCAGTGGTCGTGCTGACTCTCAGCTCCATCGCCCTGGACCGCTGGTACGCCATCTGCCAC 180 

Qy 451 CCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCCTGGGCATCTGGGCT 510 

II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 CCGCTGTTGTTCAAGAGCACTGCCCGGCGCGCCCGCGGCTCCATCCTCGGCATCTGGGCG 240 

Qy 511 GTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCAGTGTGCTGCCT 570 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 

Db 241 GTGTCGCTGGCTGTCATGGTGCCTCAGGCTGCTGTCATGGAGTGTAGCAGCGTGCTGCCC 300 



Qy 571 GAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGGCAGATGACCTC 630 

Mill I I I I I I I I I I I II III I II I I I I II I I I I I I I I I I I I I I I I I I I I I 

Db 301 GAGCTGGCCAACCGCACCCGCCTCCTGTCTGTCTGTGATGAGCGCTGGGCAGACGACCTG 360 

Qy 631 TATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCCCACTGGGCCTC 690 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 361 TACCCCAAGATCTACCACAGCTGCTTCTTCATTGTCACCTACCTGGCCCCACTGGGCCTC 420 

Qy 691 ATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACC 750 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 421 ATGGCCATGGCCTATTTCCAGATCTTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACC 480 



Qy 751 ACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGGACCTGGAGCAG 810 

I I I I I II I I I I I I I I I I I I I I I I I I I' I I I I I I I I I I I I I I I I I I I I II III 
Db 481 ACCTCGGCCCTGGTGCGCAACTGGAAGCGGCCCTCAGACCAGCTGGACGACCAGGGCCAG 540 

Qy 811 GGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGATG 870 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 

Db 541 GGCCTGAGCTCAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCCGAGGTGAAACAGATG 600 

Qy 871 CGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGC 930 

II II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 601 CGAGCCCGGAGGAAGACGGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGC 660 

Qy 931 TACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGTTCCGCCAAGCC 990 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 661 TACCTGCCCATCAGTGTCCTCAACGTCCTCAAGAGGGTCTTCGGGATGTTCCGCCAAGCC 720 



Qy 991 AGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAAC 1050 

II I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 721 AGCGACCGAGAGGCCATCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAAC 780 

Qy 1051 AGCGCTGCC 1059 

I I I I I I I I 
Db 781 AGCGCCGCC 789 



RESULT 15 
US-09-119-788-1 

; Sequence 1, Application US/09119788 



; Patent No. 6166193 
; GENERAL INFORMATION: 

APPLICANT: Yanagisawa, Masashi 
; TITLE OF INVENTION: CDNA CLONE MY1 THAT ENCODES 

TITLE OF INVENTION: A NOVEL HUMAN 7-TRANSMEMBRANE RECEPTOR 
NUMBER OF SEQUENCES: 2 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: SmithKline Beecham Corporation 
STREET: 709 Swedeland Road 
; CITY: King of Prussia 

STATE: PA 

COUNTRY: United States of America 
ZIP: 19406 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
; COMPUTER: IBM Compatible 

; OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ for Windows Version 2.0 
; CURRENT APPLICATION DATA: 

; APPLICATION NUMBER: US/09/119,788 

FILING DATE: 21-JUL-1998 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 60/053,790 
FILING DATE: 25-JUL-1997 
ATTORNEY/AGENT INFORMATION: 
NAME: King, William T 
REGISTRATION NUMBER: 30,954 
REFERENCE/DOCKET NUMBER: GH50029 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 610-270-5515 
TELEFAX: 610-270-5090 
; TELEX : 

; INFORMATION FOR SEQ ID NO: 1: 

SEQUENCE CHARACTERISTICS: 
; LENGTH: 1633 base pairs 

TYPE: nucleic acid 
; STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: Genomic DNA 
US-09-119-788-1 

Query Match 43.4%; Score 554.4; DB 3; Length 1633; 

Best Local Similarity 68.2%; Pred. No. 8.3e-120; 

Matches 819; Conservative 0; Mismatches 366; Indels 15; Gaps 3; 



Qy 80 ATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTGTACCCAAAACAGTATGAGT 139 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 217 ACGACGAGGAATTCCTGCGGTACCTGTGGAGGGAATACCTGCACCCGAAAGAATATGAGT 276 

Qy 140 GGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCCCTGGTGGGCAACACGCTGG 199 

M I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 277 GGGTCCTGATCGCCGGGTACATCATCGTGTTCGTCGTGGCTCTCATTGGGAACGTCCTGG 336 

Qy 200 TCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTCACCAACTACTTCATTGTCA 259 

I II I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 
Db 337 TTTGTGTGGCAGTGTGGAAGAACCACCACATGAGGACGGTAACCAACTACTTCATAGTCA 396 



Qy 260 ACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCTGCTGGTGG 319 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 397 ATCTTTCTCTGGCTGATGTGCTCGTGACCATCACCTGCCTTCCAGCCACACTGGTCGTGG 456 

Qy 320 ACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGG 379 

I I I I I I I I I I Mill I II II II I II I I I I I I II II II I I I I I I I I I 
Db 457 ATATCACTGAGACCTGGTTTTTTGGACAGTCCCTTTGCAAAGTGATTCCTTATCTACAGA 516 

Qy 380 CTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGTATG 439 

I I I I I I I I I I I III I II II I I II I I I I I I I I I I MM II I II II II 
Db 517 CCGTGTCGGTGTCTGTGTCTGTCCTCACACTGAGCTGTATCGCCTTGGATCGGTGGTATG 576 

Qy 440 CCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCCTGG 499 

I M I II I II II I I I I I I II I I I II II I II II II II I I I III I 
Db 577 CAATCTGTCACCCTTTGATGTTTAAGAGCACAGCAAAGCGGGCCCGTAACAGCATTGTCA 636 

Qy 500 GCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCA 559 

I II II I I II I I I I I II I I II II I I I II I II I I I I I I I II I 
Db 637 TCATCTGGATTGTCTCCTGCATTATAATGATTCCTCAGGCCATCGTCATGGAGTGCAGCA 696 

Qy 560 GTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGG 619 

I II I I I I I I I I I I I I II I II II I II I I I I I I I I II I II II 
Db 697 CCGTGTTCCCAGGCTTAGCCAATAAAACCACCCTCTTTACGGTGTGTGATGAGCGCTGGG 756 

Qy 620 CAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCC 679 

II I II I I I II I I I I II I I I I M I II I I II II I I I I I II I 
Db 757 GTGGTGAAATTTATCCCAAGATGTACCACATCTGTTTCTTTCTGGTGACATACATGGCAC 816 

Qy 680 CACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGA 739 

Mill I I I II I II II I I III I II I II II I II II I I II I I I II I I I I 
Db 817 CACTGTGTCTCATGGTGTTGGCTTATCTGCAAATATTTCGCAAACTCTGGTGTCGACAGA 876 

Qy 740 TCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGG 799 

I I I I I I I I I II I I I I I I I I I I II II I I I II I I 
Db 877 TCCCTGGAACATCATCTGTAGTTCAGAGAAAATGGAAGCCCC------TGCAGCCTGTTT 930 

Qy 800 ACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAG 859 

II I III I I I I II I I I MM I I II I I M 
Db 931 CACAGCCTCGAGGGCCAGGACAGCCAACGAAGTCCCGGATGGGCGCTGTGGCGGCTGAAA 990 

Qy 860 TGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCT 919 

I I I I I II II I I II I II I II I I I II I II I I I II I I II I I I II I I II I 
Db 991 TAAAGCAGATCCGAGCCAGAAGGAAAACAGCCCGGATGTTGATGGTTGTGCTTTTGGTAT 1050 

Qy 920 TCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGT 979 

I M I I M I I II II II III II I I II II I I II II I I I II II I I I I I I I 
Db 1051 TTGCAATTTGCTATCTACCAATTAGCATCCTCAATGTGCTAAAGAGAGTATTTGGGATGT 1110 

Qy 980 TCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGG 1039 

I I I I I I I I I II I I I I I I I M I I I I M I I I I I I I I I I I I I I 
Db 1111 TTGCCCATACTGAAGACAGAGAGACTGTGTATGCCTGGTTTACCTTTTCACACTGGCTTG 1170 

Qy 1040 TGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCAAATTCCGGG 1099 

I II Mill II II I I I II II M II II II II I I II II II I II II II I 
Db 1171 TATATGCCAATAGTGCTGCGAATCCAATTATTTATAATTTTCTCAGTGGAAAATTTCGAG 1230 



Qy 1100 AGCAGTTTAAGGCTGCCTTCTC---CTGCTGCCTGCCTGGCCTGGGTCCCTGCGGCTCTC 1156 

I I I I I I I I I I I I I I I I I ! M I I I I I I I III I II 

Db 1231 AGGAATTTAAAGCTGCGTTTTCTTGCTGTTGCCTTGGAGTTCACCATCGCCAGGAGGATC 1290 

Qy 1157 TGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTGTCCTTGCAGAGCCGAT 1216 

I II II III I I I I I I I I I I I I I I I I I I I I 

Db 1291 GGCTCACCAGGGGACGAACTAGCACAGAGAGCCGGAAGTCCTTGACCACTCAAATCAGCA 1350 

Qy 1217 GCT C CGT CT CCAAAAT CTCTGAGCAT GT GGT GCT CACCAGCGT CACCACAGTGC 1270 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1351 ACTTTGATAACATATCAAAACTTTCTGAGCAAGTTGTGCTCACTAGCATAAGCACACTCC 1410 



Search completed: October 15, 2004, 22:55:05 
Job time : 103.178 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 
Run on: October 15, 2004, 19:59:43 ; Search time 659.145 Seconds 
(without alignments) 
9829.265 Million cell updates/sec 



Title: US-10-070-532-1 
Perfect score: 1278 
Sequence: 1 atggagccctcagccacccc..........tcaccacagtgctgccctga 1278 



Scoring table: IDENTITY_NUC 

Gapop 10.0 , Gapext 1.0 

Searched: 3340653 seqs, 2534783454 residues 

Total number of hits satisfying chosen parameters: 6681306 

Minimum DB seq length: 0 
Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 
Maximum Match 100% 
Listing first 45 summaries 
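
The throughput figure in the header above can be related to the other header values. The sketch below is an assumption, not a documented GenCore definition: it treats one "cell update" as one query residue compared with one database residue, and doubles the count on the guess that both strands are searched (the hit total, 6681306, is exactly twice the 3340653 sequences searched). The function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=2):
    """Dynamic-programming cells filled per second, in millions.

    Assumption: every query residue is compared with every database
    residue once per strand searched (strands=2 covers the query and
    its complement).
    """
    cells = strands * query_len * db_residues
    return cells / seconds / 1e6


if __name__ == "__main__":
    # Header values: query length 1278, 2534783454 residues, 659.145 seconds
    rate = million_cell_updates_per_sec(1278, 2534783454, 659.145)
    print(f"{rate:.3f} million cell updates/sec")
```

Under these assumptions the result lands within about 0.01 of the reported 9829.265, a difference consistent with the search time being printed to three decimals.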



Database : Published_Applications_NA:* 

1: /cgn2_6/ptodata/l/pubpna/US07_PUBCOMB.seq:* 
2: /cgn2_6/ptodata/l/pubpna/PCT_NEW_PUB.seq:* 
3: /cgn2_6/ptodata/l/pubpna/US06_NEW_PUB.seq:* 
4: /cgn2_6/ptodata/l/pubpna/US06_PUBCOMB.seq:* 
5: /cgn2_6/ptodata/l/pubpna/US07_NEW_PUB.seq:* 
6: /cgn2_6/ptodata/l/pubpna/PCTUS_PUBCOMB.seq:* 
7: /cgn2_6/ptodata/l/pubpna/US08_NEW_PUB.seq:* 
8: /cgn2_6/ptodata/l/pubpna/US08_PUBCOMB.seq:* 
9: /cgn2_6/ptodata/l/pubpna/US09A_PUBCOMB.seq:* 
10: /cgn2_6/ptodata/l/pubpna/US09B_PUBCOMB.seq:* 
11: /cgn2_6/ptodata/l/pubpna/US09C_PUBCOMB.seq:* 
12: /cgn2_6/ptodata/l/pubpna/US09_NEW_PUB.seq:* 
13: /cgn2_6/ptodata/l/pubpna/US09_NEW_PUB.seq2:* 
14: /cgn2_6/ptodata/l/pubpna/US10A_PUBCOMB.seq:* 
15: /cgn2_6/ptodata/l/pubpna/US10B_PUBCOMB.seq:* 
16: /cgn2_6/ptodata/l/pubpna/US10C_PUBCOMB.seq:* 
17: /cgn2_6/ptodata/l/pubpna/US10_NEW_PUB.seq:* 
18: /cgn2_6/ptodata/l/pubpna/US60_NEW_PUB.seq:* 
19: /cgn2_6/ptodata/l/pubpna/US60_PUBCOMB.seq:* 

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
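
The header names an "sw model", which is read here as Smith-Waterman local alignment, together with the IDENTITY_NUC table and gap penalties Gapop 10.0 and Gapext 1.0. The following is a minimal sketch of that kind of scoring with affine gaps, not GenCore's implementation. The function name is hypothetical, the mismatch score of -0.6 is an inference from the scores printed in this listing, and the convention that a gap of length k costs Gapop + k x Gapext is likewise an assumption.

```python
def sw_affine(query, target, match=1.0, mismatch=-0.6,
              gap_open=10.0, gap_ext=1.0):
    """Best Smith-Waterman local alignment score with affine gaps.

    Assumed gap convention: a gap of length k costs gap_open + k * gap_ext.
    Only the score is returned; no traceback is kept.
    """
    neg_inf = float("-inf")
    n_cols = len(target)
    h_prev = [0.0] * (n_cols + 1)      # best score ending at (i-1, j)
    f_prev = [neg_inf] * (n_cols + 1)  # best score ending in a gap that skips query bases
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (n_cols + 1)
        f_cur = [neg_inf] * (n_cols + 1)
        e = neg_inf                    # best score ending in a gap that skips target bases
        for j in range(1, n_cols + 1):
            # Extend an existing gap or open a new one, in each direction.
            e = max(e - gap_ext, h_cur[j - 1] - gap_open - gap_ext)
            f_cur[j] = max(f_prev[j] - gap_ext,
                           h_prev[j] - gap_open - gap_ext)
            step = match if query[i - 1] == target[j - 1] else mismatch
            # Local alignment: never let a cell drop below zero.
            h_cur[j] = max(0.0, h_prev[j - 1] + step, e, f_cur[j])
            if h_cur[j] > best:
                best = h_cur[j]
        h_prev, f_prev = h_cur, f_cur
    return best


if __name__ == "__main__":
    print(sw_affine("ACGT", "ACGT"))
    print(sw_affine("ACGT", "ACTT"))
```

The sketch keeps only two rows of each matrix, so memory grows with the target length rather than with the product of both lengths; a search tool would additionally need a traceback to print alignments like those in the RESULT sections.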

SUMMARIES 
RESULT 1 

US-09-828-538-23 

; Sequence 23, Application US/09828538 
; Patent No. US20010025031A1 
; GENERAL INFORMATION: 



APPLICANT: Ellis, Catherine E. 
APPLICANT: Kwok, Cheni 
APPLICANT: Bodsworth, Nicola J. 
APPLICANT: Halsey, Wendy 
APPLICANT: Van Horn, Stephanie 

TITLE OF INVENTION: HFGAN72 Receptor Genomic DNA and Methods 
TITLE OF INVENTION: of Use Thereof in Diagnostic Applications 
FILE REFERENCE: GH-50038-C1 
CURRENT APPLICATION NUMBER: US/09/828,538 
CURRENT FILING DATE: 2001-04-06 
PRIOR APPLICATION NUMBER: 60/088,624 
PRIOR FILING DATE: 1998-06-08 
PRIOR APPLICATION NUMBER: 60/093,726 
PRIOR FILING DATE: 1998-07-22 
PRIOR APPLICATION NUMBER: 09/328,014 
PRIOR FILING DATE: 1999-06-08 
NUMBER OF SEQ ID NOS: 24 

SOFTWARE: FastSEQ for Windows Version 3.0 
SEQ ID NO 23 
LENGTH: 1564 
TYPE: DNA 

ORGANISM: HOMO SAPIENS 
US-09-828-538-23 

Query Match 99.7%; Score 1274.8; DB 9; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 0; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I | | I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I M I II I I I I I I I I I I 

Db 214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGCGATTATCTG 273 

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 

Db 274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

i i i i i i i i i i i i i i i i i i i i i i i i i I i i i i i i i i i i i i i i i i i i i i i i i ri i i i i i i i i i 

Db 334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

II irl I I II I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I Mi I III 

Db 394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

| I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

| | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573 



Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 574 [illegible] 633 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i.i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 m 1 1 
Db 634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 M 
Db 694 [illegible] 753 

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1.1 1 1 1 1 1 1 1 1 1 II 1 1 
Db 754 [illegible] 813 

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 814 [illegible] 873 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

1 I I I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 
Db 874 [illegible] 933 

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 934 [illegible] 993 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 994 [illegible] 1053 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 1054 [illegible] 1113 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 1114 [illegible] 1173 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

1 I I I I I I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 1174 [illegible] 1233 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 
Db 1234 [illegible] 1293 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353 

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 1354 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413 

Qy 1261 ACCACAGTGCTGCCCTGA 1278 

I II 1 1 1 II 1 1 1 1 1 1 1 1 1 1 
Db 1414 ACCACAGTGCTGCCCTGA 1431 



RESULT 2 

US-10-225-567A-367 

; Sequence 367, Application US/10225567A 

; Publication No. US20030113798A1 

; GENERAL INFORMATION: 

; APPLICANT: Lifespan Biosciences 

; APPLICANT: Brown, Joseph P. 

; APPLICANT: Burmer, Glenna C. 

APPLICANT: Roush, Christine L. 
; TITLE OF INVENTION: ANTIGENIC PEPTIDES AND ANTIBODIES FOR G PROTEIN-COUPLED RECEPTORS (GPCRS) 
; FILE REFERENCE: 1920-4-4 

; CURRENT APPLICATION NUMBER: US/10/225,567A 
; CURRENT FILING DATE: 2001-12-19 

PRIOR APPLICATION NUMBER: 60/257,144 
; PRIOR FILING DATE: 2000-12-19 
; NUMBER OF SEQ ID NOS: 2292 

SOFTWARE: PatentIn version 3.1 
; SEQ ID NO 367 

LENGTH: 1564 

TYPE: DNA 
; ORGANISM: Homo sapiens 
US-10-225-567A-367 

Query Match 99.7%; Score 1274.8; DB 15; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 0; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 



Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

III I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I 

Db 154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273 

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 
Db 274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

M I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I 

Db 334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I J I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I M I I I 
Db 454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573 



Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I II I I I I I I I I I I II I I I I I Ml Mill I 

Db 574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753 

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

I | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I LI I 

Db 754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813 



Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933 

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 934 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 993 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113 



Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1174 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293 



Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

| | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353 

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1354 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413 



Qy 1261 ACCACAGTGCTGCCCTGA 1278 

I I I I I I I I I I I I I I I I I I 
Db 1414 ACCACAGTGCTGCCCTGA 1431 



RESULT 3 

US-10-352-684A-21 

Sequence 21, Application US/10352684A 
Publication No. US20030215452A1 
GENERAL INFORMATION: 
APPLICANT: Millennium Pharmaceuticals Inc. 
APPLICANT: Carroll, Joseph M. 
APPLICANT: Healy, Aileen 
APPLICANT: Weich, Nadine S. 
APPLICANT: Kelly, Louise M. 

TITLE OF INVENTION: METHODS AND COMPOSITIONS FOR TREATING 
TITLE OF INVENTION: HEMATOLOGICAL DISORDERS USING 131, 148, 199, 12303, 13906, 
TITLE OF INVENTION: 15513, 17822, 302, 5677, 194, 14393, 28059, 7366, 12212, 
TITLE OF INVENTION: 1981, 261, 12416, 270, 1410, 137, 1871, 13051, 1847, 1849, 
TITLE OF INVENTION: 15402, 340, 10217, 837, 1761, 8990 OR 13249 MOLECULES 
FILE REFERENCE: MPI02-019P1RNOMNIM 
CURRENT APPLICATION NUMBER: US/10/352,684A 
CURRENT FILING DATE: 2003-01-28 
PRIOR APPLICATION NUMBER: US 60/354,333 
PRIOR FILING DATE: 2002-02-04 
PRIOR APPLICATION NUMBER: US 60/360,258 
PRIOR FILING DATE: 2002-02-28 
PRIOR APPLICATION NUMBER: US 60/364,476 
PRIOR FILING DATE: 2002-03-15 
PRIOR APPLICATION NUMBER: US 60/375,626 
PRIOR FILING DATE: 2002-04-26 
PRIOR APPLICATION NUMBER: US 60/386,494 
PRIOR FILING DATE: 2002-06-06 
PRIOR APPLICATION NUMBER: US 60/390,965 
PRIOR FILING DATE: 2002-06-24 
PRIOR APPLICATION NUMBER: US 60/392,480 
PRIOR FILING DATE: 2002-06-28 
PRIOR APPLICATION NUMBER: US 60/394,128 
PRIOR FILING DATE: 2002-07-03 
PRIOR APPLICATION NUMBER: US 60/399,783 
PRIOR FILING DATE: 2002-07-31 
PRIOR APPLICATION NUMBER: US 60/403,221 
PRIOR FILING DATE: 2002-08-13 

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 62 

SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 21 
LENGTH: 1564 
TYPE: DNA 

ORGANISM: Homo Sapiens 
FEATURE: 
NAME/KEY: CDS 
LOCATION: (154)...(1431) 
US-10-352-684A-21 



Query Match 99.7%; Score 1274.8; DB 16; Length 1564; 

Best Local Similarity 99.8%; Pred. No. 0; 

Matches 1276; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 154 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 213 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
Db 214 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 273 

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 274 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 333 



Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I 

Db 334 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 393 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | 

Db 394 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 453 



Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I 1 I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 454 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 513 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 514 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 573 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I 

Db 574 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 633 



Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 634 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 693 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 

Db 694 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 753 



Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 754 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 813 

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 814 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 873 



Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 874 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 933 



Qy 781 840 

Db 934 993 



Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 994 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 1053 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1054 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 1113 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1114 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1173 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1174 ACCTTGTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1233 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1234 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1293 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1294 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1353 

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 1354 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1413 

Qy 1261 ACCACAGTGCTGCCCTGA 1278 

I I I I I I I I I I I I I I I I I I 
Db 1414 ACCACAGTGCTGCCCTGA 1431 



RESULT 4 

US-09-826-509-548 

; Sequence 548, Application US/09826509 
; Publication No. US20030204073A1 
; GENERAL INFORMATION: 

; APPLICANT: Lehmann-Bruinsma, Karin 
; APPLICANT: Liaw, Chen W. 
; APPLICANT: Lin, I-Lin 

; TITLE OF INVENTION: Non-Endogenous, Constitutively Activated Known G 
; TITLE OF INVENTION: Protein-Coupled Receptors 
; FILE REFERENCE: AREN-207 

CURRENT APPLICATION NUMBER: US/09/826,509 
; CURRENT FILING DATE: 2001-04-05 
; PRIOR APPLICATION NUMBER: 60/195,747 
; PRIOR FILING DATE: 2000-04-07 
; PRIOR APPLICATION NUMBER: 09/170,496 
; PRIOR FILING DATE: 1998-10-13 
; NUMBER OF SEQ ID NOS: 589 
; SOFTWARE: PatentIn Version 2.1 
; SEQ ID NO 548 
; LENGTH: 1278 
; TYPE: DNA 
; ORGANISM: Homo sapiens 
US-09-826-509-548 

Query Match 99.4%; Score 1270; DB 11; Length 1278; 

Best Local Similarity 99.6%; Pred. No. 0; 

Matches 1273; Conservative 0; Mismatches 5; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

1 1 I I I I 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 I I I I i 1 1 I I I I 1 1 I I 1 1 1 1 I I 1 1 1 1 I I I I 1 1 I I I 1 1 I 1 1 

Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 



Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I \ I I I I I I I I 
Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 



Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 



Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 



Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 



Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 fi 1 1 1 1 

Db 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 



Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 



Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 



Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 



Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 



Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAAAAAAGATGCTG 900 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

Qy 1201 TCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1201 TCCTTGCAGAGCCGATGCTCCATCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTC 1260 

Qy 1261 ACCACAGTGCTGCCCTGA 1278 

I I I I I I I I I I I I I I I I I I 

Db 1261 ACCACAGTGCTGCCCTGA 1278 



RESULT 5 
US-10-077-874-1 

Sequence 1, Application US/10077874 
Publication No. US20020115155A1 
GENERAL INFORMATION: 

APPLICANT: Soppet, Daniel et al 

TITLE OF INVENTION: Human Neuropeptide Receptor 
NUMBER OF SEQUENCES: 12 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Human Genome Sciences, Inc. 
STREET: 9410 Key West Avenue 
CITY: Rockville 
STATE: MD 
COUNTRY: USA 
ZIP: 20850 



COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/10/077,874 

; FILING DATE: 20-Feb-2002 

; CLASSIFICATION: <Unknown> 

PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 08/462,509 

FILING DATE: 05-JUNE-1995 
ATTORNEY/AGENT INFORMATION: 
NAME: Wales, Michele M. 
REGISTRATION NUMBER: 43,975 
REFERENCE/DOCKET NUMBER: PF168P1D1 
; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 301-309-8504 
; TELEFAX: 301-309-8439 

INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 1209 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
FEATURE: 
; NAME/KEY: CDS 
; LOCATION: 1..1209 

SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
US-10-077-874-1 

Query Match 94.5%; Score 1207.4; DB 14; Length 1209; 

Best Local Similarity 99.9%; Pred. No. 0; 

Matches 1208; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 



Qy 1201 TCCTTGCAG 1209 

I I I I I I II 
Db 1201 TCCTTGTAG 1209 
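
The summary figures printed with each result can be cross-checked with a short script. The sketch below is not part of the search program's output. The function names are hypothetical, and the formulas are assumptions inferred from the numbers in this listing: Query Match is taken as Score divided by the perfect score of 1278, and Best Local Similarity as Matches divided by aligned positions. The mismatch cost of 0.6 is likewise inferred from the listed scores of gap-free alignments; it is not stated anywhere in the report.

```python
# Cross-check of the per-result summary lines (assumed formulas, see note above).

PERFECT_SCORE = 1278  # perfect score of the query, as given in the report header


def query_match_pct(score, perfect_score=PERFECT_SCORE):
    """Query Match: score as a percentage of the perfect score, one decimal."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches, mismatches, indels=0):
    """Best Local Similarity: matches as a percentage of aligned positions."""
    return round(100.0 * matches / (matches + mismatches + indels), 1)


def ungapped_score(matches, mismatches, mismatch_cost=0.6):
    """Score of a gap-free alignment: +1 per match, minus an assumed
    (inferred, not documented) cost per mismatch."""
    return round(matches - mismatch_cost * mismatches, 1)


if __name__ == "__main__":
    # (label, matches, mismatches, score) copied from the summary lines above
    listed = [
        ("RESULT 3", 1276, 2, 1274.8),
        ("RESULT 4", 1273, 5, 1270.0),
        ("RESULT 5", 1208, 1, 1207.4),
    ]
    for label, matches, mismatches, score in listed:
        print(
            label,
            query_match_pct(score),
            best_local_similarity_pct(matches, mismatches),
            ungapped_score(matches, mismatches),
        )
```

A disagreement between a recomputed value and the printed one would point to a transcription error in the summary line rather than in the alignment itself.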



RESULT 6 
US-09-393-696-1 

; Sequence 1, Application US/09393696 
; Publication No. US20030022277A1 
; GENERAL INFORMATION: 

; APPLICANT: Human Genome Sciences, Inc. et al. 
; TITLE OF INVENTION: Human Neuropeptide Receptor 
; FILE REFERENCE: PF168P2 

; CURRENT APPLICATION NUMBER: US/09/393,696 

; CURRENT FILING DATE: 1999-09-10 

; EARLIER APPLICATION NUMBER: PCT/US95/05616 

; EARLIER FILING DATE: 1995-05-05 

; EARLIER APPLICATION NUMBER: US08/462,509 

; EARLIER FILING DATE: 1995-06-05 

; NUMBER OF SEQ ID NOS: 23 
; SOFTWARE: PatentIn Ver. 2.0 
; SEQ ID NO 1 
; LENGTH: 1209 
; TYPE: DNA 
; ORGANISM: Homo sapiens 

FEATURE: 

NAME/KEY: CDS 

LOCATION: (1)..(1209) 
US-09-393-696-1 

Query Match 94.0%; Score 1201; DB 10; Length 1209; 

Best Local Similarity 99.6%; Pred. No. 0; 

Matches 1204; Conservative 0; Mismatches 5; Indels 0; Gaps 0; 

Qy 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60 

Qy 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120 

Qy 121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

       |||||||||||||||||||||||||||||| |||||||||||||||||||||||||||||
Db 121 TACCCAAAACAGTATGAGTGGGTCCTCATCCCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180 

Qy 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240 

Qy 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300 

Qy 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360 

Qy 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420 

Qy 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480 

Qy 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540 

Qy 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600 

Qy 601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

       |||||| |||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 GTCTGTCATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660 

Qy 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720 

Qy 721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

       || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 AACCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780 

Qy 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840 

       |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db 781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840 

Qy 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900 

Qy 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960 

Qy 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020 

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080 

Qy 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 CTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCTGGCCTG 1140 

Qy 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 GGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTG 1200 

Qy 1201 TCCTTGCAG 1209 

        |||||| ||
Db 1201 TCCTTGTAG 1209 



RESULT 7 
US-10-077-874-3 

; Sequence 3, Application US/10077874 
; Publication No. US20020115155A1 

GENERAL INFORMATION: 
; APPLICANT: Soppet, Daniel et al 

TITLE OF INVENTION: Human Neuropeptide Receptor 
NUMBER OF SEQUENCES: 12 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Human Genome Sciences, Inc. 

; STREET: 9410 Key West Avenue 

CITY: Rockville 
STATE: MD 
; COUNTRY: USA 

ZIP: 20850 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: PatentIn Release #1.0, Version #1.30 

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/ 10/077 , 874 
FILING DATE: 20-Feb-2002 
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: 08/462,509 

; FILING DATE: 05-JUNE-1995 

ATTORNEY/AGENT INFORMATION: 
NAME: Wales, Michele M. 
REGISTRATION NUMBER: 43,975 
REFERENCE/ DOCKET NUMBER: PF168P1D1 
; TELECOMMUNICATION INFORMATION: 

TELEPHONE: 301-309-8504 
TELEFAX : 301-309-8439 
; INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 1110 base pairs 
; TYPE: nucleic acid 

STRANDEDNESS : single 
; TOPOLOGY: linear 

MOLECULE TYPE: DNA (genomic) 
; FEATURE : 

; NAME/ KEY: CDS 

LOCATION: 1..1110 
SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
US-10-077-874-3 

Query Match 85.0%; Score 1085.8; DB 14; Length 1110; 

Best Local Similarity 99.8%; Pred. No. 6.6e-295; 

Matches 1087; Conservative 0; Mismatches 2; Indels 0; Gaps 0; 



Qy    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

Qy   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

Qy  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

Qy  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

Qy  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840



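Qy  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900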



Qy  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
        ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Db  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTCCCCATCAGCGTCCTCAATGTCCTT 960

Qy  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy 1081 CTCAGTGGC 1089
        |||||||||
Db 1081 CTCAGTGGC 1089



RESULT 8 
US-10-077-874-5 

; Sequence 5, Application US/10077874
; Publication No. US20020115155A1
; GENERAL INFORMATION:
; APPLICANT: Soppet, Daniel et al
; TITLE OF INVENTION: Human Neuropeptide Receptor
; NUMBER OF SEQUENCES: 12
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Human Genome Sciences, Inc.
; STREET: 9410 Key West Avenue
; CITY: Rockville
; STATE: MD
; COUNTRY: USA
; ZIP: 20850
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.30
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/077,874
; FILING DATE: 20-Feb-2002
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 08/462,509
; FILING DATE: 05-JUNE-1995
; ATTORNEY/AGENT INFORMATION:
; NAME: Wales, Michele M.
; REGISTRATION NUMBER: 43,975
; REFERENCE/DOCKET NUMBER: PF168P1D1
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 301-309-8504
; TELEFAX: 301-309-8439
; INFORMATION FOR SEQ ID NO: 5:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 1116 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: single
; TOPOLOGY: linear
; MOLECULE TYPE: DNA (genomic)
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 1..1116
; SEQUENCE DESCRIPTION: SEQ ID NO: 5:
US-10-077-874-5 

Query Match 84.8%; Score 1083.2; DB 14; Length 1116; 

Best Local Similarity 99.7%; Pred. No. 3.6e-294; 

Matches 1085; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 
Qy    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||
Db    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGACCCC 60

Qy   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

Qy  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

Qy  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

Qy  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840

Qy  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960

Qy  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy 1081 CTCAGTGG 1088
        ||||||||
Db 1081 CTCAGTGG 1088



RESULT 9 
US-09-393-696-5 

; Sequence 5, Application US/09393696
; Publication No. US20030022277A1
; GENERAL INFORMATION:
; APPLICANT: Human Genome Sciences, Inc. et al.
; TITLE OF INVENTION: Human Neuropeptide Receptor
; FILE REFERENCE: PF168P2
; CURRENT APPLICATION NUMBER: US/09/393,696
; CURRENT FILING DATE: 1999-09-10
; EARLIER APPLICATION NUMBER: PCT/US95/05616
; EARLIER FILING DATE: 1995-05-05
; EARLIER APPLICATION NUMBER: US08/462,509
; EARLIER FILING DATE: 1995-06-05
; NUMBER OF SEQ ID NOS: 23
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 5
; LENGTH: 1133
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (1)..(1131)
US-09-393-696-5



Query Match 84.8%; Score 1083.2; DB 10; Length 1133; 

Best Local Similarity 99.7%; Pred. No. 3.6e-294; 

Matches 1085; Conservative 0; Mismatches 3; Indels 0; Gaps 0; 



Qy    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||
Db    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGACCCC 60

Qy   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

Qy  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

Qy  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600

Qy  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660

Qy  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840

Qy  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960

Qy  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy 1081 CTCAGTGG 1088
        ||||||||
Db 1081 CTCAGTGG 1088



RESULT 10 
US-09-730-931-1 

; Sequence 1, Application US/09730931
; Patent No. US20020064814A1
; GENERAL INFORMATION:
; APPLICANT: ELLIS, CATHERINE E.
; TITLE OF INVENTION: DOG OREXIN 1 RECEPTOR
; FILE REFERENCE: GH-70669
; CURRENT APPLICATION NUMBER: US/09/730,931
; CURRENT FILING DATE: 2000-12-06
; PRIOR APPLICATION NUMBER: 60/169,373
; PRIOR FILING DATE: 1999-12-07
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 1
; LENGTH: 1281
; TYPE: DNA
; ORGANISM: CANIS FAMILIARIS
US-09-730-931-1

Query Match 84.7%; Score 1083; DB 9; Length 1281; 

Best Local Similarity 90.9%; Pred. No. 4.2e-294; 

Matches 1165; Conservative 0; Mismatches 110; Indels 6; Gaps 1; 

Qy    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
        |||||||||||||||||||||||||||||||  |||  |||| | ||| || | |||| |
Db    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGACTGGGACCCCCACCGGCGGCGGGGAGCTG 60

Qy   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGAT 114

|| || I II I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I III 



Db   61 TCTCCGTCACTGGTGCCTCCCGACTATGAAGACGAGTTCCTGCGCTATCTGTGGCGCGAT 120

Qy  115 TATCTGTACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTC 174
        || ||||||||||| ||||||||||||||||||||||| ||||| |||||||||||| |
Db  121 TACCTGTACCCAAAGCAGTATGAGTGGGTCCTCATCGCTGCCTACGTGGCTGTGTTCCTA 180

Qy  175 GTGGCCCTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGG 234
        |||||||||||||||||||||||||||||||||||||||||| |||||||||||||||||
Db  181 GTGGCCCTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGAGGAACCACCACATGAGG 240

Qy  235 ACAGTCACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATC 294
        || ||||||||||| ||||||||||||||||||||||||| || |||||||| || |||
Db  241 ACGGTCACCAACTATTTCATTGTCAACCTGTCCCTGGCTGATGTGCTGGTGACAGCCATC 300

Qy  295 TGCCTGCCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTC 354
        ||||| ||||||||||||||||| |||||||||||||||||||| ||||| ||| |||||
Db  301 TGCCTCCCGGCCAGCCTGCTGGTAGACATCACTGAGTCCTGGCTCTTCGGTCATACCCTC 360

Qy  355 TGCAAGGTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGC 414
        ||||| ||||||||||| |||||||| ||||| ||||| ||||||||||| |||||||||
Db  361 TGCAAAGTCATCCCCTACCTACAGGCCGTGTCTGTGTCGGTGGCAGTGCTGACTCTCAGC 420

Qy  415 TTCATCGCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCC 474
        |||||||||||||||||||||||||||||||||||||| || |||||||||||||| |||
Db  421 TTCATCGCCCTGGACCGCTGGTATGCCATCTGCCACCCGCTGTTGTTCAAGAGCACCGCC 480

Qy  475 CGGCGGGCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCC 534
        ||||| |||||  ||||||||||||||||||||||||||||  ||||  ||||||| ||
Db  481 CGGCGCGCCCGCAGCTCCATCCTGGGCATCTGGGCTGTGTCATTGGCTGTCATGGTACCT 540

Qy  535 CAGGCTGCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTC 594
        |||||||| ||||||||||||||||| ||||| ||||||||||||||||||||| || |||
Db  541 CAGGCTGCCGTCATGGAATGCAGCAGCGTGCTCCCTGAGCTAGCCAACCGCACCCGCCTC 600

Qy  595 TTCTCAGTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGC 654
        ||||| || |||||||||| ||||||||||||||||||||||||||||||||||||||||
Db  601 TTCTCTGTGTGTGATGAACACTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGC 660

Qy  655 TTCTTTATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATA 714
        ||||| |||||||||||| ||||||||||||||||||||||||||||||||||||||||
Db  661 TTCTTCATTGTCACCTACTTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATC 720

Qy  715 TTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGG 774
        ||||||||||||||||||||||||||||| |||||||| || || ||||| ||||||||
Db  721 TTCCGCAAGCTCTGGGGCCGCCAGATCCCTGGCACCACATCGGCCCTGGTGAGGAACTGG 780

Qy  775 AAGCGCCCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCC 834
        ||||| ||||| |||||||||| ||||| || ||  |||||||| |  || ||||  ||
Db  781 AAGCGGCCCTCGGACCAGCTGGAGGACCAGGGGCCCGGCCTGAGCGCGGAACCCCCCCCT 840

Qy  835 CGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAG 894
        |||||||| |||||||||||||| |||||||||||||| || ||||||||||| ||||||
Db  841 CGGGCCCGGGCCTTCCTGGCTGAGGTGAAGCAGATGCGAGCGCGGAGGAAGACGGCCAAG 900

Qy  895 ATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAAT 954
        |||||||||||| ||||||||||||| ||||||||||||||||||||||| |||||||||
Db  901 ATGCTGATGGTGCTGCTGCTGGTCTTTGCCCTCTGCTACCTGCCCATCAGTGTCCTCAAT 960

Qy  955 GTCCTTAAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCC 1014
        ||||| ||||||||||||||||||||||||||| |||||||||| ||||| || ||||||
Db  961 GTCCTCAAGAGGGTGTTCGGGATGTTCCGCCAATCCAGTGACCGAGAAGCCGTGTACGCC 1020

Qy 1015 TGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTAC 1074
        ||||||||||||||||||||||||||||| ||||||||||||||||||||||||||||||
Db 1021 TGCTTCACCTTCTCCCACTGGCTGGTGTATGCCAACAGCGCTGCCAACCCCATCATCTAC 1080

Qy 1075 AACTTCCTCAGTGGCAAATTCCGGGAGCAGTTTAAGGCTGCCTTCTCCTGCTGCCTGCCT 1134
        ||||||||||| |||||||||||||||||||||||||| |||||||||||||||||||||
Db 1081 AACTTCCTCAGCGGCAAATTCCGGGAGCAGTTTAAGGCCGCCTTCTCCTGCTGCCTGCCT 1140

Qy 1135 GGCCTGGGTCCCTGCGGCTCTCTGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAG 1194
        |||||||||||||||||||||| ||||||||| || ||||||||||||||||||||||||
Db 1141 GGCCTGGGTCCCTGCGGCTCTCCGAAGGCCCCCAGCCCCCGCTCCTCTGCCAGCCACAAG 1200

Qy 1195 TCCTTGTCCTTGCAGAGCCGATGCTCCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACC 1254
        |||||||||||||| ||||| ||||||||||||||| || | ||||| ||||||||||||
Db 1201 TCCTTGTCCTTGCACAGCCGGTGCTCCGTCTCCAAAGTCCCCGAGCACGTGGTGCTCACC 1260

Qy 1255 AGCGTCACCACAGTGCTGCCC 1275
        || |||||||| |||||||||
Db 1261 AGTGTCACCACGGTGCTGCCC 1281



RESULT 11 
US-09-393-696-3 

; Sequence 3, Application US/09393696
; Publication No. US20030022277A1
; GENERAL INFORMATION:
; APPLICANT: Human Genome Sciences, Inc. et al.
; TITLE OF INVENTION: Human Neuropeptide Receptor
; FILE REFERENCE: PF168P2
; CURRENT APPLICATION NUMBER: US/09/393,696
; CURRENT FILING DATE: 1999-09-10
; EARLIER APPLICATION NUMBER: PCT/US95/05616
; EARLIER FILING DATE: 1995-05-05
; EARLIER APPLICATION NUMBER: US08/462,509
; EARLIER FILING DATE: 1995-06-05
; NUMBER OF SEQ ID NOS: 23
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 3
; LENGTH: 1110
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (1)..(1110)
US-09-393-696-3

Query Match 84.3%; Score 1077.8; DB 10; Length 1110;

Best Local Similarity 99.4%; Pred. No. 1.2e-292;

Matches 1082; Conservative 0; Mismatches 7; Indels 0; Gaps 0;



Qy    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATGGAGCCCTCAGCCACCCCAGGGGCCCAGATGGGGGTCCCCCCTGGCAGCAGAGAGCCG 60

Qy   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 TCCCCTGTGCCTCCAGACTATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTG 120

Qy  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 TACCCAAAACAGTATGAGTGGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCC 180

Qy  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 CTGGTGGGCAACACGCTGGTCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTC 240

Qy  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 ACCAACTACTTCATTGTCAACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTG 300

Qy  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 CCGGCCAGCCTGCTGGTGGACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAG 360

Qy  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GTCATCCCCTATCTACAGGCTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATC 420

Qy  421 GCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CCCCTGGACCGCTGGTATGCCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGG 480

Qy  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 GCCCGTGGCTCCATCCTGGGCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCT 540

Qy  541 GCAGTCATGGAATGCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600
        ||||||||| ||| ||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GCAGTCATGCAATCCAGCAGTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCA 600


Qy  601 GTCTGTGATGAACGCTGGGCAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTT 660




I I I I I I I I l I I I I I I 1 1 1 1 1 1 t 1 1 1 1 1 1 1 1 1 1 1 i 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
| | | | 1 I 1 1 I 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 




Db 


601 


bl bl bl bAl bAAbbbi bbbbAbAl bAbLlblAl ^ W ^J\t\\jl\ 1 ^ 1 rW-* \^r\\^r\\j L 1 11^111 


660 


Qy  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 ATTGTCACCTACCTGGCCCCACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGC 720

Qy  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 AAGCTCTGGGGCCGCCAGATCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGC 780

Qy  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCC 840
        |||||||||||||||||||||||||||||||||||||||||||||||||||||||||| |
Db  781 CCCTCAGACCAGCTGGGGGACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGGC 840

Qy  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 CGCGCCTTCCTGGCTGAAGTGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTG 900

Qy  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTT 960
        ||||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Db  901 ATGGTGGTGCTGCTGGTCTTCGCCCTCTGCTACCTCCCCATCAGCGTCCTCAATGTCCTT 960

Qy  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 AAGAGGGTGTTCGGGATGTTCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTC 1020

Qy 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 ACCTTCTCCCACTGGCTGGTGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTC 1080

Qy 1081 CTCAGTGGC 1089
        |||||||||
Db 1081 CTCAGTGGC 1089



RESULT ,12 
US-10-278-087A-55 

; Sequence 55, Application US/10278087A
; Publication No. US20030138817A1
; GENERAL INFORMATION:
; APPLICANT: Shuji Hinuma
;   Yasuaki Ito
;   Ryo Fujii
; TITLE OF INVENTION: G Protein Coupled Receptor Protein, Production, And Use Thereof
; NUMBER OF SEQUENCES: 61
; CORRESPONDENCE ADDRESS:
; ADDRESSEE: Edwards & Angell, LLP
; STREET: 101 Federal Street
; CITY: BOSTON
; STATE: MA
; COUNTRY: USA
; ZIP: 02209
; COMPUTER READABLE FORM:
; MEDIUM TYPE: Floppy disk
; COMPUTER: IBM PC compatible
; OPERATING SYSTEM: PC-DOS/MS-DOS
; SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)
; CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/10/278,087A
; FILING DATE: 31-Jan-2003
; CLASSIFICATION: <Unknown>
; PRIOR APPLICATION DATA:
; APPLICATION NUMBER: 09/461,436
; FILING DATE: 14-DEC-1999
; APPLICATION NUMBER: 09/038,572
; FILING DATE: 11-MAR-1998
; APPLICATION NUMBER: 08/513,974
; FILING DATE: 14-SEP-1995
; APPLICATION NUMBER: PCT/JP95/01599
; FILING DATE: 10-AUG-1995
; APPLICATION NUMBER: 7-093989
; FILING DATE: 19-APR-1995
; APPLICATION NUMBER: 7-057186
; FILING DATE: 16-MAR-1995
; APPLICATION NUMBER: 7-007177
; FILING DATE: 20-JAN-1995
; APPLICATION NUMBER: 6-326611
; FILING DATE: 28-DEC-1994
; APPLICATION NUMBER: 6-270017
; FILING DATE: 02-NOV-1994
; APPLICATION NUMBER: 6-236357
; FILING DATE: 30-SEP-1994
; APPLICATION NUMBER: 6-236356
; FILING DATE: 30-SEP-1994
; APPLICATION NUMBER: 6-189274
; FILING DATE: 11-AUG-1994
; APPLICATION NUMBER: 6-189273
; FILING DATE: 11-AUG-1994
; APPLICATION NUMBER: 6-189272
; FILING DATE: 11-AUG-1994
; ATTORNEY/AGENT INFORMATION:
; NAME: CONLIN, DAVID G.
; REGISTRATION NUMBER: <Unknown>
; REFERENCE/DOCKET NUMBER: 45753 DIV3
; TELECOMMUNICATION INFORMATION:
; TELEPHONE: 617-439-4444
; TELEFAX: 617-439-4170
; INFORMATION FOR SEQ ID NO: 55:
; SEQUENCE CHARACTERISTICS:
; LENGTH: 789 base pairs
; TYPE: nucleic acid
; STRANDEDNESS: double
; TOPOLOGY: linear
; MOLECULE TYPE: cDNA
; SEQUENCE DESCRIPTION: SEQ ID NO: 55:
US-10-278-087A-55 



Query Match 52.6%; Score 672.2; DB 15; Length 789; 

Best Local Similarity 90.7%; Pred. No. 1.1e-178;

Matches 716; Conservative 0; Mismatches 73; Indels 0; Gaps 0; 

Qy 271 GCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCTGCTGGTGGACATCACTGAG 330 

II II II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II MINIM II 
Db 1 GCCGATGTGCTGGTGACAGCCATCTGCCTGCCGGCCAGTCTGCTGGTAGACATCACGGAA 60

Qy 331 TCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGGCTGTGTCCGTG 390 

I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 TCCTGGCTCTTTGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGGCCGTGTCCGTG 120 

Qy 391 TCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGTATGCCATCTGCCAC 450 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 TCAGTGGTCGTGCTGACTCTCAGCTCCATCGCCCTGGACCGCTGGTACGCCATCTGCCAC 180

Qy 451 CCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCCTGGGCATCTGGGCT 510 

II II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I 

Db 181 CCGCTGTTGTTCAAGAGCACTGCCCGGCGCGCCCGCGGCTCCATCCTCGGCATCTGGGCG 240 

Qy 511 GTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCAGTGTGCTGCCT 570 

' I I I I I I I I I I I I I I I I I I I I I I I I I II II II I II I I I II I I I II I I I I I II I 

Db 241 GTGTCGCTGGCTGTCATGGTGCCTCAGGCTGCTGTCATGGAGTGTAGCAGCGTGCTGCCC 300 



Qy 

Db 



571 
301 



630 



360 



Qy 631 TATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCCCACTGGGCCTC 690 

II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 361 TACCCCAAGATCTACCACAGCTGCTTCTTCATTGTCACCTACCTGGCCCCACTGGGCCTC 420 

Qy 691 ATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACC 750

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 421 ATGGCCATGGCCTATTTCCAGATCTTCCGCAAGCTCTGGGGCCGCCAGATCCCCGGCACC 480

Qy 751 ACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGGACCTGGAGCAG 810 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I. I I I I II III 

Db 481 ACCTCGGCCCTGGTGCGCAACTGGAAGCGGCCCTCAGACCAGCTGGACGACCAGGGCCAG 540

Qy 811 GGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAGTGAAGCAGATG 870 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I 

Db 541 GGCCTGAGCTCAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCCGAGGTGAAACAGATG 600 

Qy 871 CGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGC 930 

II II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 601 XGAGCCCGGAGGAAGACGGCCAAGATGCTGATGGTGGTGCTGCTGGTCTTCGCCCTCTGC 660 

Qy 931 TACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGTTCCGCCAAGCC 990 

I I I I I I I I I I I I II MINIM I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 661 TACCTGCCCATCAGTGTCCTCAACGTCCTCAAGAGGGTCTTCGGGATGTTCCGCCAAGCC 720

Qy 991 AGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAAC 1050 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 721 AGCGACCGAGAGGCCATCTACGCCTGCTTCACCTTCTCCCACTGGCTGGTGTACGCCAAC 780

Qy 1051 AGCGCTGCC 1059 

I I I II II I 

Db 781 AGCGCCGCC 789 



RESULT 13 
US-10-282-717-1 

; Sequence 1, Application US/10282717
; Publication No. US20030083466A1
; GENERAL INFORMATION:
; APPLICANT: YANAGISAWA, MASASHI
; TITLE OF INVENTION: cDNA CLONE MY1 THAT ENCODES A NOVEL
; TITLE OF INVENTION: HUMAN 7-TRANSMEMBRANE RECEPTOR
; FILE REFERENCE: GH50029D1C1
; CURRENT APPLICATION NUMBER: US/10/282,717
; CURRENT FILING DATE: 2002-10-28
; PRIOR APPLICATION NUMBER: 09/676,625
; PRIOR FILING DATE: 2000-10-02
; PRIOR APPLICATION NUMBER: 09/119,788
; PRIOR FILING DATE: 1998-07-21
; PRIOR APPLICATION NUMBER: 60/053,790
; PRIOR FILING DATE: 1997-07-25
; NUMBER OF SEQ ID NOS: 2
; SOFTWARE: FastSEQ for Windows Version 3.0
; SEQ ID NO 1
; LENGTH: 1633
; TYPE: DNA
; ORGANISM: HOMO SAPIENS
US-10-282-717-1 

Query Match 43.4%; Score 554.4; DB 15; Length 1633; 

Best Local Similarity 68.2%; Pred. No. 1.8e-145; 

Matches 819; Conservative 0; Mismatches 366; Indels 15; Gaps 3; 

Qy 80 ATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTGTACCCAAAACAGTATGAGT 139

I II II II II II II II I I I I I I I II II III I I I I III I I I I I I I I 
Db 217 ACGACGAGGAATTCCTGCGGTACCTGTGGAGGGAATACCTGCACCCGAAAGAATATGAGT 276 

Qy 140 GGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCCCTGGTGGGCAACACGCTGG 199

I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 277 GGGTCCTGATCGCCGGGTACATCATCGTGTTCGTCGTGGCTCTCATTGGGAACGTCCTGG 336 

Qy 200 TCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTCACCAACTACTTCATTGTCA 259

MINIMUM I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 337 TTTGTGTGGCAGTGTGGAAGAACCACCACATGAGGACGGTAACCAACTACTTCATAGTCA 396

Qy 260 ACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCTGCTGGTGG 319 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 397 ATCTTTCTCTGGCTGATGTGCTCGTGACCATCACCTGCCTTCCAGCCACACTGGTCGTGG 456 

Qy 320 ACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGG 379 

I I I I I I I I I I I I I I I I II II II I I I I I I I I I II II II I I I I I I I I I 
Db 457 ATATCACTGAGACCTGGTTTTTTGGACAGTCCCTTTGCAAAGTGATTCCTTATCTACAGA 516

Qy 380 CTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGTATG 439

I I I I I I I I I I I III I II II II II I I I I I I I I I I I I I I II I I I I I I I 
Db 517 CCGTGTCGGTGTCTGTGTCTGTCCTCACACTGAGCTGTATCGCCTTGGATCGGTGGTATG 576 

Qy 440 CCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCCTGG 499

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I . 
Db 577 CAATCTGTCACCCTTTGATGTTTAAGAGCACAGCAAAGCGGGCCCGTAACAGCATTGTCA 636

Qy 500 GCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCA 559 

I I I I I I I I I I II I I I I I I II I I I I I I I I I I I II I I I I I I I 

Db 637 TCATCTGGATTGTCTCCTGCATTATAATGATTCCTCAGGCCATCGTCATGGAGTGCAGCA 696

Qy 560 GTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGG 619 

I I I I I I I I I I I I I I II I I I I I I I I I II I I I II I I I I I II 

Db 697 CCGTGTTCCCAGGCTTAGCCAATAAAACCACCCTCTTTACGGTGTGTGATGAGCGCTGGG 756

Qy 620 CAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCC 679 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 757 GTGGTGAAATTTATCCCAAGATGTACCACATCTGTTTCTTTCTGGTGACATACATGGCAC 816

Qy 680 CACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGA 739 

I I I I I I I I I I I I I I I I I III I II I I I I I I I I I I I I I I I I I II I I I I 
Db 817 CACTGTGTCTCATGGTGTTGGCTTATCTGCAAATATTTCGCAAACTCTGGTGTCGACAGA 876 

Qy 740 TCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGG 799

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 877 T CC CT GGAACAT CAT CT GT AGT T CAGAGAAAAT GGAAGCCC C TGCAGCCTGTTT 930 



Qy 

Db 



800 
931 



859 
990 



Qy 860 TGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCT 919
I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I
Db 991 TAAAGCAGATCCGAGCCAGAAGGAAAACAGCCCGGATGTTGATGGTTGTGCTTTTGGTAT 1050

Qy 920 TCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGT 979
I II I I I I I I II II II III I I I I I I I I I I II I I I I I II II I I I I I I I
Db 1051 TTGCAATTTGCTATCTACCAATTAGCATCCTCAATGTGCTAAAGAGAGTATTTGGGATGT 1110

Qy 980 TCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGG 1039
I III I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I
Db 1111 TTGCCCATACTGAAGACAGAGAGACTGTGTATGCCTGGTTTACCTTTTCACACTGGCTTG 1170

Qy 1040 TGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCAAATTCCGGG 1099
I II I I I I I II I I I I I II II II II II II II I I I I I I I I I I I I I II I
Db 1171 TATATGCCAATAGTGCTGCGAATCCAATTATTTATAATTTTCTCAGTGGAAAATTTCGAG 1230

Qy 1100 AGCAGTTTAAGGCTGCCTTCTC CTGCTGCCTGCCTGGCCTGGGTCCCTGCGGCTCTC 1156
I I I I I I I I I I I I I I I I I I II I I I I I I I III I II
Db 1231 AGGAATTTAAAGCTGCGTTTTCTTGCTGTTGCCTTGGAGTTCACCATCGCCAGGAGGATC 1290

Qy 1157 TGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTGTCCTTGCAGAGCCGAT 1216
I II II III I I I I I I I I I I I I I I I I I I I I
Db 1291 GGCTCACCAGGGGACGAACTAGCACAGAGAGCCGGAAGTCCTTGACCACTCAAATCAGCA 1350

Qy 1217 GCT------CCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTCACCACAGTGC 1270
II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1351 ACTTTGATAACATATCAAAACTTTCTGAGCAAGTTGTGCTCACTAGCATAAGCACACTCC 1410



RESULT 14 

US-10-225-567A-369 

; Sequence 369, Application US/10225567A
; Publication No. US20030113798A1
; GENERAL INFORMATION:
; APPLICANT: Lifespan Biosciences
; APPLICANT: Brown, Joseph P.
; APPLICANT: Burmer, Glenna C.
; APPLICANT: Roush, Christine L.
; TITLE OF INVENTION: ANTIGENIC PEPTIDES AND ANTIBODIES FOR G PROTEIN-COUPLED RECEPTORS (GPCRS)
; FILE REFERENCE: 1920-4-4
; CURRENT APPLICATION NUMBER: US/10/225,567A
; CURRENT FILING DATE: 2001-12-19
; PRIOR APPLICATION NUMBER: 60/257,144
; PRIOR FILING DATE: 2000-12-19
; NUMBER OF SEQ ID NOS: 2292
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 369
; LENGTH: 1843
; TYPE: DNA
; ORGANISM: Homo sapiens
US-10-225-567A-369 



Query Match 43.4%; Score 554.4; DB 15; Length 1843; 

Best Local Similarity 68.2%; Pred. No. 1.8e-145; 

Matches 819; Conservative 0; Mismatches 366; Indels 15; Gaps 3;



Qy 80 ATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTGTACCCAAAACAGTATGAGT 139

I II II II II II II II I I I I I I I II II III I I I I III I I I I I I I I 
Db 428 ACGACGAGGAATTCCTGCGGTACCTGTGGAGGGAATACCTGCACCCGAAAGAATATGAGT 487 

Qy 140 GGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCCCTGGTGGGCAACACGCTGG 199 

I I M I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 488 GGGTCCTGATCGCCGGGTACATCATCGTGTTCGTCGTGGCTCTCATTGGGAACGTCCTGG 547

Qy 200 TCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTCACCAACTACTTCATTGTCA 259

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 548 TTTGTGTGGCAGTGTGGAAGAACCACCACATGAGGACGGTAACCAACTACTTCATAGTCA 607

Qy 260 ACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCTGCTGGTGG 319

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 608 ATCTTTCTCTGGCTGATGTGCTCGTGACCATCACCTGCCTTCCAGCCACACTGGTCGTGG 667 



Qy 320 ACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGG 379 

I I I I I I I I I I I I I I I I II II II I I I I I I I I I II II II I I I I I I I I I 
Db 668 ATATCACTGAGACCTGGTTTTTTGGACAGTCCCTTTGCAAAGTGATTCCTTATCTACAGA 727 

Qy 380 CTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGTATG 439 

I I I I I I I I I I I III I II II II II I I I I I I I I I I I I I I II I I I I I I I 
Db 728 CCGTGTCGGTGTCTGTGTCTGTCCTCACACTGAGCTGTATCGCCTTGGATCGGTGGTATG 787 



Qy 440 CCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCCTGG 499 

I I I I I I I I I I I I I I I I I II II II I I I I I I I I I I I I I I I III I 

Db 788 CAATCTGTCACCCTTTGATGTTTAAGAGCACAGCAAAGCGGGCCCGTAACAGCATTGTCA 847

Qy 500 GCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCA 559 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 848 TCATCTGGATTGTCTCCTGCATTATAATGATTCCTCAGGCCATCGTCATGGAGTGCAGCA 907



Qy 560 GTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGG 619 

II I I I I I I II I I I I II I I I I I I I I I II I I I I I I I I I I I I 

Db 908 CCGTGTTCCCAGGCTTAGCCAATAAAACCACCCTCTTTACGGTGTGTGATGAGCGCTGGG 967

Qy 620 CAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCC 679 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 968 GTGGTGAAATTTATCCCAAGATGTACCACATCTGTTTCTTTCTGGTGACATACATGGCAC 1027

Qy 680 CACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGA 739 

I I I I I I I I I I I I I I I I I III I II I I I II I I I I I I I I I I I I II I I I I 
Db 1028 CACTGTGTCTCATGGTGTTGGCTTATCTGCAAATATTTCGCAAACTCTGGTGTCGACAGA 1087



Qy 740 TCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGG 799 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I r 

Db 1088 T CCCT GGAACAT CAT CTGTAGTT CAGAGAAAAT GGAAGCCCC TGCAGC CTGTTT 1141 

Qy 800 ACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAG 859

II I Ml I I I I I I I I I I I I I I I I I I I I I 

Db 1142 CACAGCCTCGAGGGCCAGGACAGCCAACGAAGTCCCGGATGAGCGCTGTGGCGGCTGAAA 1201



Qy 860 TGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCT 919

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | I I I I | | | | | 
Db 1202 TAAAGCAGATCCGAGCCAGAAGGAAAACAGCCCGGATGTTGATGGTTGTGCTTTTGGTAT 1261

Qy 920 TCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGT 979

I I I . I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1262 TTGCAATTTGCTATCTACCAATTAGCATCCTCAATGTGCTAAAGAGAGTATTTGGGATGT 1321

Qy 980 TCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGG 1039 

I Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1322 TTGCCCATACTGAAGACAGAGAGACTGTGTATGCCTGGTTTACCTTTTCACACTGGCTTG 1381 

Qy 1040 TGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCAAATTCCGGG 1099

I M I I I I I II I I I I I II II II II M II II I I I I I I I I I I I I I || | 
Db 1382 TATATGCCAATAGTGCTGCGAATCCAATTATTTATAATTTTCTCAGTGGAAAATTTCGAG 1441 

Qy 1100 AGCAGTTTAAGGCTGCCTTCTC CTGCTGCCTGCCTGGCCTGGGTCCCTGCGGCTCTC 1156 

I I I I I I I I I I I I I I I I I I I I I I I I I I I III | || 

Db 1442 AGGAATTTAAAGCTGCGTTTTCTTGCTGTTGCCTTGGAGTTCACCATCGCCAGGAGGATC 1501 

Qy 1157 TGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTGTCCTTGCAGAGCCGAT 1216 

Ill I.I I I I I I I I I I I I I I Mill 

Db 1502 GGCTCACCAGGGGACGAACTAGCACAGAGAGCCGGAAGTCCTTGACCACTCAAATCAGCA 1561 

Qy 1217 GCT------CCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTCACCACAGTGC 1270

II I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I M I I 

Db 1562 ACTTTGATAACATATCAAAACTTTCTGAGCAAGTTGTGCTCACTAGCATAAGCACACTCC 1621



RESULT 15 
US-09-826-509-550 

; Sequence 550, Application US/09826509
; Publication No. US20030204073A1
; GENERAL INFORMATION:
; APPLICANT: Lehmann-Bruinsma, Karin
; APPLICANT: Liaw, Chen W.
; APPLICANT: Lin, I-Lin
; TITLE OF INVENTION: No. US20030204073A1-Endogenous, Constitutively Activated Known G
; TITLE OF INVENTION: Protein-Coupled Receptors
; FILE REFERENCE: AREN-207
; CURRENT APPLICATION NUMBER: US/09/826,509
; CURRENT FILING DATE: 2001-04-05
; PRIOR APPLICATION NUMBER: 60/195,747
; PRIOR FILING DATE: 2000-04-07
; PRIOR APPLICATION NUMBER: 09/170,496
; PRIOR FILING DATE: 1998-10-13
; NUMBER OF SEQ ID NOS: 589
; SOFTWARE: PatentIn Version 2.1
; SEQ ID NO 550
; LENGTH: 1335
; TYPE: DNA
; ORGANISM: Homo sapiens
US-09-826-509-550 



Query Match 43.0%; Score 549.6; DB 11; Length 1335;

Best Local Similarity 68.0%; Pred. No. 3.7e-144;



Matches 816; Conservative 0; Mismatches 369; Indels 15; Gaps 3; 



Qy 80 ATGAAGATGAGTTTCTCCGCTATCTGTGGCGTGATTATCTGTACCCAAAACAGTATGAGT 139

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 104 ACGACGAGGAATTCCTGCGGTACCTGTGGAGGGAATACCTGCACCCGAAAGAATATGAGT 163

Qy 140 GGGTCCTCATCGCAGCCTATGTGGCTGTGTTCGTCGTGGCCCTGGTGGGCAACACGCTGG 199

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I i I I I I 

Db 164 GGGTCCTGATCGCCGGGTACATCATCGTGTTCGTCGTGGCTCTCATTGGGAACGTCCTGG 223 

Qy 200 TCTGCCTGGCCGTGTGGCGGAACCACCACATGAGGACAGTCACCAACTACTTCATTGTCA 259

III I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 224 TTTGTGTGGCAGTGTGGAAGAACCACCACATGAGGACGGTAACCAACTACTTCATAGTCA 283

Qy 260 ACCTGTCCCTGGCTGACGTTCTGGTGACTGCTATCTGCCTGCCGGCCAGCCTGCTGGTGG 319 

I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 284 ATCTTTCTCTGGCTGATGTGCTCGTGACCATCACCTGCCTTCCAGCCACACTGGTCGTGG 343

Qy 320 ACATCACTGAGTCCTGGCTGTTCGGCCATGCCCTCTGCAAGGTCATCCCCTATCTACAGG 379

I I I I I I I I II I I I I I I II II II I I I I I I I II MM II I I I I I I I I I 
Db 344 ATATCACTGAGACCTGGTTTTTTGGACAGTCCCTTTGCAAAGTGATTCCTTATCTACAGA 403

Qy 380 CTGTGTCCGTGTCAGTGGCAGTGCTAACTCTCAGCTTCATCGCCCTGGACCGCTGGTATG 439

I II I I I II I I I Mill! II II II I I I I I I I I I I Ml || I I I I I II 

Db 404 CCGTGTCGGTGTCTGTGTCTGTCCTCACACTGAGCTGTATCGCCTTGGATCGGTGGTATG 463

Qy 440 CCATCTGCCACCCACTATTGTTCAAGAGCACAGCCCGGCGGGCCCGTGGCTCCATCCTGG 499

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I 
Db 464 CAATCTGTCACCCTTTGATGTTTAAGAGCACAGCAAAGCGGGCCCGTAACAGCATTGTCA 523

Qy 500 GCATCTGGGCTGTGTCGCTGGCCATCATGGTGCCCCAGGCTGCAGTCATGGAATGCAGCA 559 

I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 

Db 524 TCATCTGGATTGTCTCCTGCATTATAATGATTCCTCAGGCCATCGTCATGGAGTGCAGCA 583

Qy 560 GTGTGCTGCCTGAGCTAGCCAACCGCACACGGCTCTTCTCAGTCTGTGATGAACGCTGGG 619 

I I I I I I I I I I I I I I II I I I II IM I I I I I I I I I I I I I I I I 

Db 584 CCGTGTTCCCAGGCTTAGCCAATAAAACCACCCTCTTTACGGTGTGTGATGAGCGCTGGG 643 

Qy 620 CAGATGACCTCTATCCCAAGATCTACCACAGTTGCTTCTTTATTGTCACCTACCTGGCCC 679 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 644 GTGGTGAAATTTATCCCAAGATGTACCACATCTGTTTCTTTCTGGTGACATACATGGCAC 703

Qy 680 CACTGGGCCTCATGGCCATGGCCTATTTCCAGATATTCCGCAAGCTCTGGGGCCGCCAGA 739

I I I I I I I I II I I I I I I I III I II I I I I I I Ml I I I I I I I I II II I I 

Db 704 CACTGTGTCTCATGGTGTTGGCTTATCTGCAAATATTTCGCAAACTCTGGTGTCGACAGA 763

Qy 740 TCCCCGGCACCACCTCAGCACTGGTGCGGAACTGGAAGCGCCCCTCAGACCAGCTGGGGG 799

I I I I I I I I I II I I I I I II I II II II II I II I I 

Db 764 T CC CT GGAACAT CAT CT GTAGT T C AGAGAAAAT GGAAGCC CC TGCAGCCTGTTT 817 

Qy 800 ACCTGGAGCAGGGCCTGAGTGGAGAGCCCCAGCCCCGGGCCCGCGCCTTCCTGGCTGAAG 859

II I II I I I I I I I I I I Mill I I M II I 

Db 818 CACAGCCTCGAGGGCCAGGACAGCCAACGAAGTCCCGGATGAGCGCTGTGGCGGCTGAAA 877 

Qy 860 TGAAGCAGATGCGTGCACGGAGGAAGACAGCCAAGATGCTGATGGTGGTGCTGCTGGTCT 919

I I I I I I I I I I I I I I I I I I I I I I I I II II M I I I II I II I II I I 

Db 878 TAAAGCAGATCCGAGCCAGAAGGAAAACAAAACGGATGTTGATGGTTGTGCTTTTGGTAT 937



Qy 920 TCGCCCTCTGCTACCTGCCCATCAGCGTCCTCAATGTCCTTAAGAGGGTGTTCGGGATGT 979
I II I I I I I I II II II III I I I I I I I I I I II I I I I I II II I I II I I I
Db 938 TTGCAATTTGCTATCTACCAATTAGCATCCTCAATGTGCTAAAGAGAGTATTTGGGATGT 997

Qy 980 TCCGCCAAGCCAGTGACCGCGAAGCTGTCTACGCCTGCTTCACCTTCTCCCACTGGCTGG 1039
I III I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I
Db 998 TTGCCCATACTGAAGACAGAGAGACTGTGTATGCCTGGTTTACCTTTTCACACTGGCTTG 1057

Qy 1040 TGTACGCCAACAGCGCTGCCAACCCCATCATCTACAACTTCCTCAGTGGCAAATTCCGGG 1099
I II I I I I I II I I I I I II I I II II II II II I I I I I I I I I II I I II I
Db 1058 TATATGCCAATAGTGCTGCGAATCCAATTATTTATAATTTTCTCAGTGGAAAATTTCGAG 1117

Qy 1100 AGCAGTTTAAGGCTGCCTTCTC CTGCTGCCTGCCTGGCCTGGGTCCCTGCGGCTCTC 1156
I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II
Db 1118 AGGAATTTAAAGCTGCGTTTTCTTGCTGTTGCCTTGGAGTTCACCATCGCCAGGAGGATC 1177

Qy 1157 TGAAGGCCCCTAGTCCCCGCTCCTCTGCCAGCCACAAGTCCTTGTCCTTGCAGAGCCGAT 1216
I II II III I I II I I I I I I I I I I I I I I I I
Db 1178 GGCTCACCAGGGGACGAACTAGCACAGAGAGCCGGAAGTCCTTGACCACTCAAATCAGCA 1237

Qy 1217 GCT CCGTCTCCAAAATCTCTGAGCATGTGGTGCTCACCAGCGTCACCACAGTGC 1270
II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 1238 ACTTTGATAACATATCAAAACTTTCTGAGCAAGTTGTGCTCACTAGCATAAGCACACTCC 1297



Search completed: October 16, 2004, 03:40:39 
Job time : 663.145 secs



